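Linux C/C++ Learning Diary (47): DPDK (8): UDP PPS Test: Kernel vs. DPDK

Note: This post is a personal study record shared for discussion. Corrections are welcome.

PPS stands for Packets Per Second. It is a core metric of how many packets a network device or program can process, defined as the number of packets successfully sent or received in one second.

In this post, pps means the number of packets sent per second.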
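As a minimal sketch of that definition, the rate can be computed from a packet count and an elapsed time. The function name `pps_from_elapsed` and the choice of microseconds as the time unit are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: packets per second from a packet count and the
 * elapsed time in microseconds. Returns 0 when no time has elapsed,
 * to avoid dividing by zero. */
static uint64_t pps_from_elapsed(uint64_t packets, uint64_t elapsed_us)
{
    if (elapsed_us == 0) {
        return 0;
    }
    /* Multiply before dividing so the integer division keeps precision. */
    return packets * 1000000ULL / elapsed_us;
}
```

For example, 1,000,000 packets sent in 2,000,000 µs works out to 500,000 pps.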

I. Test results for sending 1,000,000 UDP packets

DPDK:

sudo ./build/ustack -l 0 -n 2 -- --sip 192.168.248.135 --sport 8080 --dmac 00:50:56:c0:00:08 --dip 192.168.248.1 --dport 8080 --count 1000000 --udp

About 178,126 packets per second.

POSIX API:

./normal_udp_send_tool

About 72,364 packets per second.

II. Discussion of the results

PPS comparison:

  • DPDK: 178126
  • POSIX API: 72364
  • The former is roughly 2.5 times the latter (178126 / 72364 ≈ 2.46)
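The same two figures can also be read as an average time per packet, which makes the gap easier to reason about. The sketch below only does the arithmetic; the helper names `speedup` and `per_packet_us` are assumptions for illustration.

```c
/* Hypothetical helpers for comparing two measured packet rates. */

/* How many times faster rate_a is than rate_b. */
static double speedup(double rate_a_pps, double rate_b_pps)
{
    return rate_a_pps / rate_b_pps;
}

/* Average time spent per packet, in microseconds. */
static double per_packet_us(double rate_pps)
{
    return 1e6 / rate_pps;
}
```

With the measured rates, `speedup(178126, 72364)` is about 2.46, and the average cost per packet is about 5.6 µs on the DPDK path versus about 13.8 µs on the socket path.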

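III. Why

1. Where DPDK's send path gains performance

DPDK is a user-space framework built for high-performance networking. Its core advantage is kernel bypass: packets are constructed and sent entirely in user space, which removes several kernel-level costs:

  1. No per-packet user/kernel mode switch: a traditional socket send crosses from user mode into kernel mode on every call (system call overhead), whereas DPDK stays in user mode throughout;
  2. No user-to-kernel copy: a traditional socket copies user data into kernel space before the kernel hands it to the NIC. DPDK uses hugepages and mbuf memory pools so that the application writes the packet into a buffer the NIC reads directly via DMA, avoiding the extra copies;
  3. Batching: DPDK supports building and sending packets in batches (BURST_SIZE in the code), which amortizes the fixed per-call cost over many packets;
  4. Poll-mode driver: DPDK polls the NIC instead of relying on interrupts, which removes interrupt-handling latency and overhead and suits high-throughput workloads.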
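The amortization in item 3 can be illustrated with a deliberately simplified cost model: every send call pays a fixed cost once, and every packet adds its own cost. Both the model and the numbers in the usage note are assumptions, not measurements from this test.

```c
/* Simplified cost model (an assumption, not a measurement): each send
 * call pays fixed_ns once, plus per_packet_ns for every packet in the
 * burst. Returns the average cost per packet in nanoseconds. */
static double cost_per_packet_ns(double fixed_ns, double per_packet_ns,
                                 unsigned burst)
{
    if (burst == 0) {
        return 0.0; /* nothing sent, nothing to average */
    }
    return fixed_ns / (double)burst + per_packet_ns;
}
```

With a hypothetical fixed cost of 1000 ns per call and 50 ns per packet, a burst of 1 costs 1050 ns per packet, while a burst of 1000 costs 51 ns per packet.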

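2. Where the standard UDP socket bottlenecks

A standard socket runs on the kernel protocol stack, and every step carries an inherent cost:

  1. System call overhead: each sendto call switches from user mode to kernel mode and performs permission checks, argument parsing, and so on. A single call is cheap, but the cost adds up at high call rates;
  2. Kernel protocol stack processing: the kernel performs UDP encapsulation, IP encapsulation, checksum calculation, and other protocol logic. With one sendto per packet, as in this test, that work runs packet by packet and is not amortized the way a DPDK burst is;
  3. Memory copy overhead: user data is first copied into a kernel buffer and then transferred from there to the NIC, so the payload is moved twice.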
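To make the checksum step in item 2 concrete, here is a minimal sketch of the Internet checksum defined in RFC 1071, which is the algorithm used for the IPv4 header checksum. It is a plain reference version for illustration, not the kernel's implementation, which is optimized and may offload the work to the NIC. The function name `inet_checksum` is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over a byte buffer. The buffer is read as
 * big-endian 16-bit words, and the result is returned as a host-order
 * value (for example 0xb861 means the bytes b8 61 on the wire). */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* Add up the buffer as big-endian 16-bit words. */
    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    }
    /* An odd trailing byte is treated as the high byte of a final word. */
    if (i < len) {
        sum += (uint32_t)data[i] << 8;
    }
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    /* The checksum is the one's complement of the folded sum. */
    return (uint16_t)~sum;
}
```

Running it over a 20-byte IPv4 header with the checksum field zeroed gives the value to store in that field; running it again over the header with the checksum filled in returns 0, which is how a receiver validates the header.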

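3. What the numbers show

From the test results:

  • DPDK sends at about 178,000 PPS
  • The standard UDP socket sends at about 72,000 PPS
  • The former is more than 2.4 times the latter, and the gap should become more pronounced as the packet count grows and the packets get smaller (with small packets, DPDK's batching and copy-free path matter more).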

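Why the advantage is larger with small packets:

  • On the kernel path, the cost per packet is nearly the same for large and small packets. Most of it is the fixed cost of crossing between user space and the kernel (the system call plus protocol stack processing), which far outweighs the cost of copying the payload itself.
  • DPDK has no such kernel crossing, so the time to write a packet into an mbuf from the memory pool depends mainly on the payload size.
  • With small packets, then, the kernel path still pays its full fixed cost for every packet while DPDK's per-packet cost shrinks, so DPDK's advantage becomes more visible.

In short, with user-space access to the NIC, batching, and a copy-free path, DPDK removes the kernel protocol stack overhead from the send path, so its packet throughput is far higher than that of a standard socket that depends on the kernel.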
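One way to see that this test is bound by per-packet overhead rather than by bandwidth is to convert the measured rates into bits per second. The sketch below uses the frame layout built by ng_udp_send in the test code (Ethernet, IPv4, and UDP headers plus the 13-byte payload "Hello, World!"). The helper names are assumptions, and padding to the minimum Ethernet frame size and the frame check sequence are left out.

```c
#include <stdint.h>

/* Bytes in a frame made of an Ethernet header (14), an IPv4 header
 * without options (20), a UDP header (8), and the payload. Padding and
 * the frame check sequence are not counted. */
static uint32_t udp_frame_bytes(uint32_t payload_len)
{
    return 14u + 20u + 8u + payload_len;
}

/* Throughput in bits per second for a packet rate and frame size. */
static uint64_t throughput_bps(uint64_t pps, uint32_t frame_bytes)
{
    return pps * frame_bytes * 8u;
}
```

With a 55-byte frame, 178126 pps is about 78.4 Mbit/s and 72364 pps is about 31.8 Mbit/s. Both are far below what a gigabit link can carry, which suggests that the limit in this test is the work done per packet, not the volume of data.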

IV. Test code

normal_udp_send_tool.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/time.h>
#include <unistd.h>

#define SEND_DATA "Hello, World!"  // same payload as the DPDK version
#define TARGET_IP "192.168.248.1"  // target IP (replace with the actual test IP)
#define TARGET_PORT 8000           // target port (replace with the actual test port)
#define SEND_COUNT 1000000         // total number of packets to send

// Time difference in milliseconds
#define TIME_SUB_MS(tv1, tv2) ((tv1.tv_sec - tv2.tv_sec) * 1000 + (tv1.tv_usec - tv2.tv_usec) / 1000)

int main(void) {
    int sockfd;
    struct sockaddr_in dest_addr;
    // The rest of the buffer is zero-filled, so the payload stays NUL-terminated
    char send_buf[1024] = SEND_DATA;

    // Create the UDP socket
    if ((sockfd = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
        perror("socket creation failed");
        exit(EXIT_FAILURE);
    }

    // Set the destination address
    memset(&dest_addr, 0, sizeof(dest_addr));
    dest_addr.sin_family = AF_INET;
    dest_addr.sin_port = htons(TARGET_PORT);
    if (inet_pton(AF_INET, TARGET_IP, &dest_addr.sin_addr) <= 0) {
        perror("inet_pton failed");
        exit(EXIT_FAILURE);
    }

    struct timeval tv_begin, tv_end;
    gettimeofday(&tv_begin, NULL);

    // Send the packets in a loop, one sendto call per packet
    int sent = 0;
    while (sent < SEND_COUNT) {
        ssize_t send_len = sendto(sockfd, send_buf, strlen(send_buf), 0,
                                  (struct sockaddr *)&dest_addr, sizeof(dest_addr));
        if (send_len < 0) {
            perror("sendto failed");
            break;
        }
        sent++;
    }

    gettimeofday(&tv_end, NULL);
    int time_used = TIME_SUB_MS(tv_end, tv_begin);
    int success = sent;

    // Print the results
    printf("===== Standard socket UDP send details =====\n");
    printf("Target IP: %s\n", TARGET_IP);
    printf("Target port: %d\n", TARGET_PORT);
    printf("Total packets: %d\n", SEND_COUNT);
    printf("Packets sent: %d\n", success);
    printf("Payload: %s\n", send_buf);
    printf("Elapsed: %d ms\n", time_used);
    printf("Send rate: %d pps\n", time_used > 0 ? (success * 1000 / time_used) : 0);

    close(sockfd);
    return 0;
}
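The tool above times the loop with gettimeofday and rounds to milliseconds. As a hedged alternative, the elapsed time can be taken from a monotonic clock and kept in microseconds, which is not affected by wall-clock adjustments during the run. The helper name `elapsed_us` is an assumption, and the sketch covers only the subtraction.

```c
#include <stdint.h>
#include <time.h>

/* Elapsed microseconds between two struct timespec values (end - begin).
 * The difference is formed in nanoseconds first so that a tv_nsec value
 * that wraps past a second boundary is handled correctly. */
static int64_t elapsed_us(struct timespec begin, struct timespec end)
{
    int64_t ns = ((int64_t)end.tv_sec - (int64_t)begin.tv_sec) * 1000000000LL
               + ((int64_t)end.tv_nsec - (int64_t)begin.tv_nsec);
    return ns / 1000;
}
```

In use, each timestamp would come from `clock_gettime(CLOCK_MONOTONIC, &t)`, taken once before and once after the send loop.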

dpdk_udp_send_tool.c

#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_ether.h>
#include <rte_ip.h>
#include <rte_tcp.h>
#include <rte_udp.h>
#include <rte_cycles.h>
#include <rte_timer.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <inttypes.h>
#include <arpa/inet.h>
#include <getopt.h>
#include <unistd.h>
#include <sys/time.h>

#define ENABLE_SEND 1
#define ENABLE_ARP 1
#define LOG_ENABLE 0

#define NUM_MBUFS (4096 - 1)
#define BURST_SIZE 1000                          // packets per burst
static const char *send_data = "Hello, World!"; // payload shared with the socket version

#if ENABLE_SEND
static uint32_t gSrcIp;
static uint32_t gDstIp;
static uint8_t gSrcMac[RTE_ETHER_ADDR_LEN];
static uint8_t gDstMac[RTE_ETHER_ADDR_LEN];
static uint16_t gSrcPort;
static uint16_t gDstPort;
#endif

// Note: max_rx_pkt_len here and s_addr/d_addr below are the field names
// used by DPDK releases before 21.11.
static const struct rte_eth_conf port_conf_default = {
    .rxmode = {.max_rx_pkt_len = RTE_ETHER_MAX_LEN}};

// Global port ID
int gDpdkPortId = 0;

// Time difference in milliseconds
#define TIME_SUB_MS(tv1, tv2) ((tv1.tv_sec - tv2.tv_sec) * 1000 + (tv1.tv_usec - tv2.tv_usec) / 1000)

static void ng_init_port(struct rte_mempool *mbuf_pool)
{
    uint16_t nb_sys_ports = rte_eth_dev_count_avail();
    if (nb_sys_ports == 0)
    {
        rte_exit(EXIT_FAILURE, "No Supported eth found\n");
    }

    struct rte_eth_dev_info dev_info;
    rte_eth_dev_info_get(gDpdkPortId, &dev_info);

    const int num_rx_queues = 1;
    const int num_tx_queues = 1;
    struct rte_eth_conf port_conf = port_conf_default;
    rte_eth_dev_configure(gDpdkPortId, num_rx_queues, num_tx_queues, &port_conf);

    if (rte_eth_rx_queue_setup(gDpdkPortId, 0, 1024,
                               rte_eth_dev_socket_id(gDpdkPortId), NULL, mbuf_pool) < 0)
    {
        rte_exit(EXIT_FAILURE, "Could not setup RX queue\n");
    }

#if ENABLE_SEND
    struct rte_eth_txconf txq_conf = dev_info.default_txconf;
    txq_conf.offloads = port_conf.rxmode.offloads;
    if (rte_eth_tx_queue_setup(gDpdkPortId, 0, 1024,
                               rte_eth_dev_socket_id(gDpdkPortId), &txq_conf) < 0)
    {
        rte_exit(EXIT_FAILURE, "Could not setup TX queue\n");
    }
#endif

    if (rte_eth_dev_start(gDpdkPortId) < 0)
    {
        rte_exit(EXIT_FAILURE, "Could not start\n");
    }
}

static int ng_encode_tcp_pkt(uint8_t *msg, uint16_t total_len)
{
    // Ethernet header
    struct rte_ether_hdr *eth = (struct rte_ether_hdr *)msg;
    rte_memcpy(eth->s_addr.addr_bytes, gSrcMac, RTE_ETHER_ADDR_LEN);
    rte_memcpy(eth->d_addr.addr_bytes, gDstMac, RTE_ETHER_ADDR_LEN);
    eth->ether_type = htons(RTE_ETHER_TYPE_IPV4);

    // IPv4 header
    struct rte_ipv4_hdr *ip = (struct rte_ipv4_hdr *)(msg + sizeof(struct rte_ether_hdr));
    ip->version_ihl = 0x45;
    ip->type_of_service = 0;
    ip->total_length = htons(total_len - sizeof(struct rte_ether_hdr));
    ip->packet_id = 0;
    ip->fragment_offset = 0;
    ip->time_to_live = 64;
    ip->next_proto_id = IPPROTO_TCP;
    ip->src_addr = gSrcIp;
    ip->dst_addr = gDstIp;
    ip->hdr_checksum = rte_ipv4_cksum(ip);

    // TCP header
    struct rte_tcp_hdr *tcp = (struct rte_tcp_hdr *)(msg + sizeof(struct rte_ether_hdr) + sizeof(struct rte_ipv4_hdr));
    tcp->src_port = htons(gSrcPort);
    tcp->dst_port = htons(gDstPort);
    tcp->tcp_flags = 1 << 1; // SYN flag
    tcp->data_off = 0x50;
    tcp->rx_win = htons(65535);
    tcp->sent_seq = htonl(12345);
    tcp->recv_ack = 0x0;
    tcp->cksum = 0;
    tcp->cksum = rte_ipv4_udptcp_cksum(ip, tcp);

    struct in_addr addr;
    addr.s_addr = gSrcIp;
    printf(" --> tcp src: %s:%d, ", inet_ntoa(addr), gSrcPort);
    addr.s_addr = gDstIp;
    printf("dst: %s:%d, %s\n", inet_ntoa(addr), gDstPort, send_data);

    return 0;
}

static struct rte_mbuf *ng_tcp_send(struct rte_mempool *mbuf_pool)
{
    const unsigned total_len = sizeof(struct rte_ether_hdr) + sizeof(struct rte_ipv4_hdr) + sizeof(struct rte_tcp_hdr);

    struct rte_mbuf *mbuf = rte_pktmbuf_alloc(mbuf_pool);
    if (!mbuf)
    {
        rte_exit(EXIT_FAILURE, "rte_pktmbuf_alloc\n");
    }
    mbuf->pkt_len = total_len;
    mbuf->data_len = total_len;

    uint8_t *pktdata = rte_pktmbuf_mtod(mbuf, uint8_t *);
    ng_encode_tcp_pkt(pktdata, total_len);

    return mbuf;
}

static int ng_encode_udp_pkt(uint8_t *msg, uint16_t total_len)
{
    // Ethernet header
    struct rte_ether_hdr *eth = (struct rte_ether_hdr *)msg;
    rte_memcpy(eth->s_addr.addr_bytes, gSrcMac, RTE_ETHER_ADDR_LEN);
    rte_memcpy(eth->d_addr.addr_bytes, gDstMac, RTE_ETHER_ADDR_LEN);
    eth->ether_type = htons(RTE_ETHER_TYPE_IPV4);

    // IPv4 header
    struct rte_ipv4_hdr *ip = (struct rte_ipv4_hdr *)(msg + sizeof(struct rte_ether_hdr));
    ip->version_ihl = 0x45;
    ip->type_of_service = 0;
    ip->total_length = htons(total_len - sizeof(struct rte_ether_hdr));
    ip->packet_id = 0;
    ip->fragment_offset = 0;
    ip->time_to_live = 64;
    ip->next_proto_id = IPPROTO_UDP;
    ip->src_addr = gSrcIp;
    ip->dst_addr = gDstIp;
    ip->hdr_checksum = rte_ipv4_cksum(ip);

    // UDP header and payload
    struct rte_udp_hdr *udp = (struct rte_udp_hdr *)(msg + sizeof(struct rte_ether_hdr) + sizeof(struct rte_ipv4_hdr));
    udp->src_port = htons(gSrcPort);
    udp->dst_port = htons(gDstPort);
    uint16_t udplen = total_len - sizeof(struct rte_ether_hdr) - sizeof(struct rte_ipv4_hdr);
    udp->dgram_len = htons(udplen);
    rte_memcpy((uint8_t *)(udp + 1), send_data, strlen(send_data));
    udp->dgram_cksum = 0;
    udp->dgram_cksum = rte_ipv4_udptcp_cksum(ip, udp);

#if LOG_ENABLE
    struct in_addr addr;
    addr.s_addr = gSrcIp;
    printf("--> udp  src: %s:%d, ", inet_ntoa(addr), gSrcPort);
    addr.s_addr = gDstIp;
    printf("dst: %s:%d --> %s\n", inet_ntoa(addr), gDstPort, send_data);
#endif

    return 0;
}

static struct rte_mbuf *ng_udp_send(struct rte_mempool *mbuf_pool)
{
    const unsigned total_len = sizeof(struct rte_ether_hdr) + sizeof(struct rte_ipv4_hdr) + sizeof(struct rte_udp_hdr) + strlen(send_data);

    struct rte_mbuf *mbuf = rte_pktmbuf_alloc(mbuf_pool);
    if (!mbuf)
    {
        rte_exit(EXIT_FAILURE, "rte_pktmbuf_alloc\n");
    }
    mbuf->pkt_len = total_len;
    mbuf->data_len = total_len;

    uint8_t *pktdata = rte_pktmbuf_mtod(mbuf, uint8_t *);
    ng_encode_udp_pkt(pktdata, total_len);

    return mbuf;
}

static struct option args[] = {
    {"sip", required_argument, 0, 's'},
    {"sport", required_argument, 0, 'p'},
    {"dmac", required_argument, 0, 'm'},
    {"dip", required_argument, 0, 'i'},
    {"dport", required_argument, 0, 'd'},
    {"count", required_argument, 0, 'c'},
    {"udp", no_argument, 0, 'u'},
    {0, 0, 0, 0}};

int main(int argc, char *argv[])
{
    int ret = rte_eal_init(argc, argv);
    if (ret < 0)
    {
        rte_exit(EXIT_FAILURE, "Error with EAL init\n");
    }
    argc -= ret;
    argv += ret;

    int opt;      // getopt_long returns int; a plain char may never compare equal to -1
    int flag = 0; // 0: TCP, 1: UDP
    int count = 1;

    // 'u' takes no argument, matching the no_argument entry in args[]
    while ((opt = getopt_long(argc, argv, "s:p:m:i:d:c:u?", args, NULL)) != -1)
    {
        switch (opt)
        {
        case 's':
            inet_pton(AF_INET, optarg, &gSrcIp);
            break;
        case 'p':
            gSrcPort = atoi(optarg);
            break;
        case 'm':
            sscanf(optarg, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx", &gDstMac[0], &gDstMac[1], &gDstMac[2],
                   &gDstMac[3], &gDstMac[4], &gDstMac[5]);
            break;
        case 'i':
            inet_pton(AF_INET, optarg, &gDstIp);
            break;
        case 'd':
            gDstPort = atoi(optarg);
            break;
        case 'u':
            flag = 1;
            break;
        case 'c':
            count = atoi(optarg);
            count = count > 1 ? count : 1;
            break;
        case '?':
            printf("Invalid option\n");
            return -1;
        default:
            break;
        }
    }

    struct rte_mempool *mbuf_pool = rte_pktmbuf_pool_create("mbuf pool", NUM_MBUFS,
                                                            0, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
    if (mbuf_pool == NULL)
    {
        rte_exit(EXIT_FAILURE, "Could not create mbuf pool\n");
    }

    ng_init_port(mbuf_pool);
    rte_eth_macaddr_get(gDpdkPortId, (struct rte_ether_addr *)gSrcMac); // read the local MAC

    // Record the start time with both timing methods
    uint64_t start_tsc = rte_get_tsc_cycles();
    struct timeval tv_begin;
    gettimeofday(&tv_begin, NULL);

    // Send in bursts of up to BURST_SIZE until the total count is reached
    int sent = 0;
    int failed = 0;
    while (sent < count)
    {
        struct rte_mbuf *bursts[BURST_SIZE];
        int batch_count = 0;

        // Build up to BURST_SIZE packets
        for (batch_count = 0; batch_count < BURST_SIZE && sent < count; batch_count++)
        {
            struct rte_mbuf *mbuf = NULL;
            //printf("Batch %d, Packet %d\n", (sent / BURST_SIZE) + 1, batch_count + 1);
            if (flag)
            {
                mbuf = ng_udp_send(mbuf_pool);
            }
            else
            {
                // Cycle the source port to avoid overflow
                gSrcPort = (gSrcPort + sent % 10000) % 65535;
                mbuf = ng_tcp_send(mbuf_pool);
            }
            bursts[batch_count] = mbuf;
            sent++;
        }

        // Send the burst
        uint16_t txed = rte_eth_tx_burst(gDpdkPortId, 0, bursts, batch_count);
        //printf("Batch %d: Sent %d, Failed %d\n", (sent / BURST_SIZE), txed, batch_count - txed);
        failed += (batch_count - txed);

        // Free the packets that were not accepted
        for (int j = txed; j < batch_count; j++)
        {
            rte_pktmbuf_free(bursts[j]);
            sent--; // roll back the counter so these packets are rebuilt and sent again
        }
    }

    // Record the end time with both timing methods
    uint64_t end_tsc = rte_get_tsc_cycles();
    struct timeval tv_end;
    gettimeofday(&tv_end, NULL);

    // gettimeofday-based timing, in milliseconds
    int time_used_ms = TIME_SUB_MS(tv_end, tv_begin);
    // Rejected packets are retried above, so by the time the loop exits all
    // `sent` packets were accepted; `failed` counts retries, not losses.
    int success = sent;

    // TSC-based timing
    uint64_t tsc_cycles = end_tsc - start_tsc;
    double tsc_hz = rte_get_tsc_hz();
    double time_used_ns = (double)tsc_cycles / tsc_hz * 1e9; // nanoseconds
    int time_used_us = (int)(time_used_ns / 1000);           // microseconds

    // Format the destination MAC
    char dst_mac_str[20];
    snprintf(dst_mac_str, sizeof(dst_mac_str), "%02x:%02x:%02x:%02x:%02x:%02x",
             gDstMac[0], gDstMac[1], gDstMac[2],
             gDstMac[3], gDstMac[4], gDstMac[5]);

    // inet_ntoa takes the address in network byte order and returns dotted decimal
    struct in_addr src_addr;
    src_addr.s_addr = gSrcIp;
    printf("Source IP: %s\n", inet_ntoa(src_addr));

    struct in_addr dst_addr;
    dst_addr.s_addr = gDstIp;
    printf("Destination IP: %s\n", inet_ntoa(dst_addr));

    printf("Destination MAC: %s\n", dst_mac_str);
    printf("Destination port: %d\n", gDstPort);
    printf("Total packets: %d\n", count);
    printf("Packets sent: %d\n", success);
    printf("Packets rejected by the TX burst and retried: %d\n", failed);
    printf("Payload: %s\n", send_data);
    printf("Packets pushed to the TX queue per burst: %d\n", BURST_SIZE);

    printf("===== Timing comparison =====\n");
    printf("gettimeofday timing (ms): %d ms\n", time_used_ms);
    printf("TSC timing (us): %d us\n", time_used_us);
    printf("TSC timing (ns): %.2f ns\n", time_used_ns);

    printf("===== Send rate comparison =====\n");
    printf("Rate from gettimeofday: %d pps\n", time_used_ms > 0 ? (success * 1000 / time_used_ms) : 0);
    // Cast to uint64_t before multiplying to avoid integer overflow
    uint64_t tsc_pps = time_used_us > 0 ? ((uint64_t)success * 1000000) / time_used_us : 0;
    printf("Rate from TSC: %" PRIu64 " pps\n", tsc_pps);
    //printf("Rate from gettimeofday: %d pps\n", time_used_ms > 0 ? (success * 1000 / time_used_ms) : 0);
    //printf("Rate from TSC: %d pps\n", time_used_us > 0 ? (success * 1000000 / time_used_us) : 0);

    return 0;
}
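The send loop in this tool frees any packets that rte_eth_tx_burst does not accept and rolls the counter back, so those packets are rebuilt and offered again. The bookkeeping can be sketched without DPDK by replacing the transmit call with a stand-in. Everything below (`tx_fn`, `send_stats`, `send_all`, `tx_accept_at_most_600`) is hypothetical and only mirrors the counting logic. It assumes the transmit function eventually accepts packets; otherwise the loop would not terminate.

```c
#include <stdint.h>

/* Hypothetical stand-in for a transmit call such as rte_eth_tx_burst:
 * takes the number of packets offered, returns how many were accepted. */
typedef uint16_t (*tx_fn)(uint16_t offered);

struct send_stats {
    uint32_t sent;     /* packets accepted by the transmit call */
    uint32_t rejected; /* packets not accepted on a call and offered again */
    uint32_t calls;    /* number of transmit calls made */
};

/* Offer packets in bursts until `count` packets have been accepted. */
static struct send_stats send_all(uint32_t count, uint16_t burst, tx_fn tx)
{
    struct send_stats st = {0, 0, 0};

    while (st.sent < count) {
        uint32_t remaining = count - st.sent;
        uint16_t offered = remaining < burst ? (uint16_t)remaining : burst;
        uint16_t accepted = tx(offered);

        st.calls++;
        st.sent += accepted;
        st.rejected += (uint32_t)(offered - accepted);
    }
    return st;
}

/* Example stand-in: a queue that accepts at most 600 packets per call. */
static uint16_t tx_accept_at_most_600(uint16_t offered)
{
    return offered < 600 ? offered : 600;
}
```

Calling `send_all(2000, 1000, tx_accept_at_most_600)` ends with 2000 packets sent after 4 calls and 1000 rejections along the way. The point of the sketch is that rejections in a retry loop measure how often the TX queue was full, not how many packets were lost, so the count of successfully sent packets equals the requested total once the loop exits.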

