The journey of an HTTP request: tracing DNS and TCP with strace + tcpdump
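1) Tools
- strace: a powerful tool that shows every system call (syscall) a process makes. We will use it to watch a minimal HTTP client resolve a DNS name and open a TCP connection.
- tcpdump: captures packets on a network interface. We will see the DNS query and response over UDP, then the TCP three-way handshake, the data transfer, and the connection teardown.
- Goal: send one simple request to example.com and analyse what happens behind it at two levels, system calls and packets, reading the captured packets byte by byte.
- Environment: a tiny Ubuntu 24.04 machine on AWS EC2.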
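2) A minimal HTTP client
We could send the request with an off-the-shelf client such as curl, telnet, or netcat, but to learn as much as possible about the machine I prefer to write a tiny client in C. It also shows how much code it really takes to send a request to a server: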
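#include <arpa/inet.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    /* TCP socket; created first, so it shows up as fd 3 in the strace output */
    const int sock = socket(AF_INET, SOCK_STREAM, 0);

    /* Resolve the host name; this call triggers the DNS query */
    struct hostent *h = gethostbyname("example.com");
    if (h == NULL) {
        fprintf(stderr, "DNS lookup failed\n");
        return 1;
    }

    /* Connect to the first returned address on port 80 */
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(80) };
    memcpy(&addr.sin_addr, h->h_addr, h->h_length);
    connect(sock, (struct sockaddr *)&addr, sizeof(addr));

    /* Send the request, then copy the response to stdout until EOF */
    const char request[] = "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n";
    write(sock, request, strlen(request));
    char buf[4096];
    while (1) {
        ssize_t n = read(sock, buf, sizeof(buf));
        if (n <= 0) break;
        write(1, buf, n);
    }
    close(sock);
    return 0;
}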
3) Setting up two tcpdump listeners
Listing the network devices shows that this machine has two network interfaces.
$ ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 0a:e4:e2:49:cb:fd brd ff:ff:ff:ff:ff:ff
    inet 172.31.0.139/20 metric 100 brd 172.31.15.255 scope global dynamic ens5
       valid_lft 3286sec preferred_lft 3286sec
    inet6 fe80::8e4:e2ff:fe49:cbfd/64 scope link
       valid_lft forever preferred_lft forever
Then open two terminals.
- Listen for DNS (UDP/53). The address 127.0.0.53 comes from /etc/resolv.conf; systemd-resolved listens on it and forwards queries to the upstream DNS servers:
sudo tcpdump -n -vvv -X -i lo udp port 53 and host 127.0.0.53
- Listen for HTTP (TCP/80):
sudo tcpdump -n -vvv -X -i ens5 tcp port 80
4) Running strace
Compile the client from section 2, then run it under strace:
gcc -o simple_client simple_client.c
strace ./simple_client
There is a lot of output: even a simple program makes the kernel do a fair amount of work (thanks, kernel!).
execve("./simple_client", ["./simple_client"], 0x7ffee1c30590 /* 24 vars */) = 0
brk(NULL) = 0x5f6fd17ac000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9713ce2000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=28323, ...}) = 0
mmap(NULL, 28323, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f9713cdb000
close(3) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\243\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
fstat(3, {st_mode=S_IFREG|0755, st_size=2125328, ...}) = 0
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 2170256, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f9713a00000
mmap(0x7f9713a28000, 1605632, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x28000) = 0x7f9713a28000
...
...
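Most of this output comes from the dynamic loader and process initialization, which we can ignore. The core of the output follows.
4a) DNS resolution (UDP)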
socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 4
setsockopt(4, SOL_IP, IP_RECVERR, [1], 4) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(53),sin_addr=inet_addr("127.0.0.53")}, 16) = 0
poll([{fd=4, events=POLLOUT}], 1, 0) = 1
sendto(4, "...DNS query for example.com A...", 40, MSG_NOSIGNAL, NULL, 0) = 40
poll([{fd=4, events=POLLIN}], 1, 5000) = 1
ioctl(4, FIONREAD, [136]) = 0
recvfrom(4, "...DNS response with 6 A records...", 1024, 0,{sa_family=AF_INET, sin_port=htons(53),sin_addr=inet_addr("127.0.0.53")}, [28 => 16]) = 136
close(4) = 0
- The process creates a UDP socket and connects it to 127.0.0.53.
- It sends a DNS query for the A record (A stands for address) of example.com. The response contains six IPv4 addresses.
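To make the 40-byte payload passed to sendto() less opaque, here is a minimal sketch of how such a query could be assembled by hand. The name `build_dns_query` and its parameters are made up for illustration, and this is not what glibc does internally. It writes only the 12-byte header and the question, and sets only the RD flag, so it produces 29 bytes for example.com; the real query is 11 bytes longer because it also carries an EDNS OPT record.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes a DNS query for an A record into out.
 * Returns the number of bytes written, or 0 if the name is malformed
 * or the buffer is too small. */
size_t build_dns_query(const char *name, uint16_t id, uint8_t *out, size_t cap)
{
    size_t name_len = strlen(name);
    /* header (12) + encoded name (at most name_len + 2) + QTYPE/QCLASS (4) */
    if (name_len == 0 || cap < 12 + name_len + 2 + 4)
        return 0;

    size_t n = 0;
    out[n++] = (uint8_t)(id >> 8);      /* transaction ID, big-endian */
    out[n++] = (uint8_t)(id & 0xff);
    out[n++] = 0x01;                    /* flags: RD (recursion desired) */
    out[n++] = 0x00;
    out[n++] = 0x00; out[n++] = 0x01;   /* QDCOUNT = 1 */
    memset(out + n, 0, 6);              /* ANCOUNT, NSCOUNT, ARCOUNT = 0 */
    n += 6;

    /* QNAME: every dot-separated label is prefixed with its length */
    const char *p = name;
    while (*p) {
        const char *dot = strchr(p, '.');
        size_t len = dot ? (size_t)(dot - p) : strlen(p);
        if (len == 0 || len > 63)
            return 0;
        out[n++] = (uint8_t)len;
        memcpy(out + n, p, len);
        n += len;
        p += len;
        if (*p == '.')
            p++;
    }
    out[n++] = 0x00;                    /* zero-length root label ends the name */
    out[n++] = 0x00; out[n++] = 0x01;   /* QTYPE = A */
    out[n++] = 0x00; out[n++] = 0x01;   /* QCLASS = IN */
    return n;
}
```

Calling `build_dns_query("example.com", 0x4cb5, buf, sizeof buf)` fills `buf` with the header followed by the length-prefixed labels `7 "example" 3 "com" 0`.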
4b) TCP connection and HTTP request
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
connect(3, {AF_INET, port=80, addr=23.192.228.80}, 16) = 0
write(3, "GET / HTTP/1.0\r\nHost: example.co"..., 37) = 37
read(3, "HTTP/1.0 200 OK\r\nContent-Type: t"..., 4096) = 778
write(1, "HTTP/1.0 200 OK..."..., 778) = 778
read(3, "", 4096) = 0
close(3) = 0
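- The client takes the first of the six addresses returned (h->h_addr), 23.192.228.80, and calls connect() to open a TCP connection. Inside connect(), the kernel performs the three-way handshake. The client then sends the request with write() and receives the HTTP response with read().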
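In this trace the whole 37-byte request went out in a single write(), but write() is allowed to accept fewer bytes than requested. A more defensive client would loop, roughly like the sketch below. `write_all` is a hypothetical helper name, not part of the client above.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling write() until all len bytes are sent.
 * Returns 0 on success, -1 on error (errno is left as set by write()). */
int write_all(int fd, const void *data, size_t len)
{
    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                 /* skip the part that was accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

In the client this would replace the bare call: `write_all(sock, request, strlen(request))`.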
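5) Running tcpdump: capturing and reading the data
strace showed that the kernel executed many syscalls for us, but not which data actually travelled over the network. For that we rely on tcpdump. We set up the two tcpdump listeners earlier, so both terminals are now full of output.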
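5a) DNS over UDP packets
Of the two UDP packets below, the first is the DNS query and the second is the response. tcpdump captures the raw bytes (printed in hexadecimal) and also decodes them into a human-readable form (other tools such as Wireshark can do the same).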
09:06:37.822813 lo In IP (tos 0x0, ttl 64, id 59748, offset 0, flags [DF], proto UDP (17), length 68)
    127.0.0.1.53156 > 127.0.0.53.53: [bad udp cksum 0xfe77 -> 0x1061!] 19637+ [1au] A? example.com. ar: . OPT UDPsize=1200 (40)
    0x0000: 4500 0044 e964 4000 4011 530e 7f00 0001 E..D.d@.@.S.....
    0x0010: 7f00 0035 cfa4 0035 0030 fe77 4cb5 0120 ...5...5.0.wL...
    0x0020: 0001 0000 0000 0001 0765 7861 6d70 6c65 .........example
    0x0030: 0363 6f6d 0000 0100 0100 0029 04b0 0000 .com.......)....
    0x0040: 0000 0000 ....
09:06:37.823120 lo In IP (tos 0x0, ttl 1, id 19753, offset 0, flags [DF], proto UDP (17), length 164)
    127.0.0.53.53 > 127.0.0.1.53156: [bad udp cksum 0xfed7 -> 0x29a2!] 19637 q: A? example.com. 6/0/1 example.com. [21s] A 23.192.228.80, example.com. [21s] A 23.215.0.138, example.com. [21s] A 23.220.75.245, example.com. [21s] A 23.215.0.136, example.com. [21s] A 23.192.228.84, example.com. [21s] A 23.220.75.232 ar: . OPT UDPsize=65494 (136)
    0x0000: 4500 00a4 4d29 4000 0111 2dea 7f00 0035 E...M)@...-....5
    0x0010: 7f00 0001 0035 cfa4 0090 fed7 4cb5 8180 .....5......L...
    0x0020: 0001 0006 0000 0001 0765 7861 6d70 6c65 .........example
    0x0030: 0363 6f6d 0000 0100 01c0 0c00 0100 0100 .com............
    0x0040: 0000 1500 0417 c0e4 50c0 0c00 0100 0100 ........P.......
    0x0050: 0000 1500 0417 d700 8ac0 0c00 0100 0100 ................
    0x0060: 0000 1500 0417 dc4b f5c0 0c00 0100 0100 .......K........
    0x0070: 0000 1500 0417 d700 88c0 0c00 0100 0100 ................
    0x0080: 0000 1500 0417 c0e4 54c0 0c00 0100 0100 ........T.......
    0x0090: 0000 1500 0417 dc4b e800 0029 ffd6 0000 .......K...)....
    0x00a0: 0000 0000 ....
Let's focus on the second packet, the DNS response. Its first 20 bytes are the IP header:
4500 00a4 4d29 4000 0111 2dea 7f00 0035 7f00 0001
Field | Bytes | Meaning |
---|---|---|
Version & IHL | 45 | IPv4, 20-byte header |
Type of Service | 00 | Normal |
Total Length | 00a4 → 164 | 20 bytes IP header + 144 bytes payload |
Identification | 4d29 → 19753 | Packet ID (used to reassemble fragments) |
Flags & Offset | 4000 | Don’t Fragment |
TTL | 01 | 1 hop (local only — never leaves host) |
Protocol | 11 | UDP |
Checksum | 2dea | IP Header checksum |
Source IP | 7f00 0035 → 127.0.0.53 | |
Destination IP | 7f00 0001 → 127.0.0.1 |
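The table can be reproduced mechanically. Below is a minimal sketch of a parser for the fixed IPv4 header fields; the struct `ipv4_info` and the function `parse_ipv4` are names I made up for illustration, not anything from the system headers, and IP options are not decoded.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative struct; the field names are not from any system header. */
struct ipv4_info {
    unsigned version;       /* 4 for IPv4 */
    unsigned header_len;    /* in bytes (IHL * 4) */
    unsigned total_len;     /* header + payload, in bytes */
    unsigned id;
    int      dont_fragment; /* 1 if the DF flag is set */
    unsigned ttl;
    unsigned protocol;      /* 6 = TCP, 17 = UDP */
    uint8_t  src[4], dst[4];
};

/* Reads a big-endian (network order) 16-bit value. */
static unsigned be16(const uint8_t *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

/* Returns 0 on success, -1 if b cannot hold a valid IPv4 header. */
int parse_ipv4(const uint8_t *b, size_t len, struct ipv4_info *out)
{
    if (len < 20 || (b[0] >> 4) != 4)
        return -1;
    out->version = b[0] >> 4;
    out->header_len = (b[0] & 0x0f) * 4u;
    if (out->header_len < 20 || out->header_len > len)
        return -1;
    out->total_len = be16(b + 2);
    out->id = be16(b + 4);
    out->dont_fragment = (b[6] & 0x40) != 0;
    out->ttl = b[8];
    out->protocol = b[9];
    for (int i = 0; i < 4; i++) {
        out->src[i] = b[12 + i];
        out->dst[i] = b[16 + i];
    }
    return 0;
}
```

Fed the 20 header bytes of the DNS response, it reports a total length of 164, TTL 1, protocol 17 (UDP), and the addresses 127.0.0.53 and 127.0.0.1.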
The next 8 bytes are the UDP header:
0035 cfa4 0090 fed7
Field | Bytes | Meaning |
---|---|---|
Source Port | 0035 → 53 | DNS server (systemd-resolved) |
Destination Port | cfa4 → 53156 | The client port to reply to |
Length | 0090 → 144 | UDP datagram = 8-byte header + 136-byte payload |
Checksum | fed7 | UDP checksum |
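The UDP header is small enough to decode in a few lines. The sketch below uses hypothetical names (`udp_info`, `parse_udp`, `udp_payload_len`). The payload length it derives, 144 - 8 = 136, is the same 136 that recvfrom() returned in the strace output.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative struct; the field names are my own. */
struct udp_info {
    unsigned src_port, dst_port;
    unsigned length;        /* header + payload, in bytes */
    unsigned checksum;
};

/* Returns 0 on success, -1 if the buffer or the length field is too small. */
int parse_udp(const uint8_t *b, size_t len, struct udp_info *out)
{
    if (len < 8)
        return -1;
    out->src_port = ((unsigned)b[0] << 8) | b[1];
    out->dst_port = ((unsigned)b[2] << 8) | b[3];
    out->length   = ((unsigned)b[4] << 8) | b[5];
    out->checksum = ((unsigned)b[6] << 8) | b[7];
    return out->length >= 8 ? 0 : -1;
}

/* Payload size is the length field minus the 8-byte header. */
unsigned udp_payload_len(const struct udp_info *u)
{
    return u->length - 8;
}
```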
Next comes the UDP payload. A DNS message is made up of the following sections:
Section | Bytes | Description |
---|---|---|
Header | 12 bytes | Identification, flags, counts |
Question | variable | Query name (QNAME), type, class |
Answer | variable | Present only in responses |
Authority | variable | Optional NS records |
Additional | variable | OPT or other records (EDNS, etc.) |
Let's go through them one by one, starting with the header:
4cb5 8180 0001 0006 0000 0001
Field | Bytes | Meaning |
---|---|---|
Transaction ID | 4cb5 | Same ID as in the first packet, which pairs this answer with its question |
Flags | 8180 | response, recursion desired, recursion available, no error |
QDCOUNT | 0001 | 1 question |
ANCOUNT | 0006 | 6 answers |
NSCOUNT | 0000 | 0 authority |
ARCOUNT | 0001 | 1 additional (OPT) |
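The 16-bit flags word packs several fields. As a sketch, assuming the bit layout from RFC 1035 (QR, Opcode, AA, TC, RD, RA, three reserved bits, RCODE), it can be unpacked like this; the struct `dns_flags` and the function `decode_dns_flags` are illustrative names of my own.

```c
#include <stdint.h>

/* Illustrative struct for the fields RFC 1035 defines in the flags word. */
struct dns_flags {
    unsigned qr;      /* 0 = query, 1 = response */
    unsigned opcode;  /* 0 = standard query */
    unsigned aa;      /* authoritative answer */
    unsigned tc;      /* truncated */
    unsigned rd;      /* recursion desired */
    unsigned ra;      /* recursion available */
    unsigned rcode;   /* 0 = no error */
};

struct dns_flags decode_dns_flags(uint16_t f)
{
    struct dns_flags d;
    d.qr     = (f >> 15) & 0x1;
    d.opcode = (f >> 11) & 0xf;
    d.aa     = (f >> 10) & 0x1;
    d.tc     = (f >> 9)  & 0x1;
    d.rd     = (f >> 8)  & 0x1;
    d.ra     = (f >> 7)  & 0x1;
    d.rcode  = f & 0xf;
    return d;
}
```

For 0x8180 this gives qr=1, rd=1, ra=1 and rcode=0. For the query's 0x0120 it gives qr=0 and rd=1.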
Next is the Question section:
0765 7861 6d70 6c65 0363 6f6d 0000 0100 01
This section is byte-for-byte identical in the DNS query and the DNS response; in both it encodes example.com.
Byte(s) | Meaning |
---|---|
07 | Length of next label = 7 |
65 78 61 6d 70 6c 65 | “example” |
03 | Length of next label = 3 |
63 6f 6d | “com” |
00 | End of QNAME |
0001 | QTYPE = A (host address) |
0001 | QCLASS = IN (Internet) |
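The length-prefixed label encoding in the table can be decoded with a short loop. This is a minimal sketch with a made-up function name, `decode_qname`; it handles only plain labels and deliberately does not follow compression pointers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decodes an uncompressed QNAME starting at msg[off] into dotted text.
 * Returns the offset just past the terminating zero byte, or 0 on error.
 * Compression pointers (top two bits set) are rejected in this sketch. */
size_t decode_qname(const uint8_t *msg, size_t msg_len, size_t off,
                    char *out, size_t cap)
{
    size_t o = 0;
    if (cap == 0)
        return 0;
    while (off < msg_len) {
        size_t len = msg[off++];
        if (len == 0) {                 /* zero-length root label: done */
            out[o] = '\0';
            return off;
        }
        if ((len & 0xc0) != 0 || off + len > msg_len)
            return 0;
        size_t need = len + (o ? 1 : 0);
        if (o + need + 1 > cap)         /* keep room for the terminator */
            return 0;
        if (o)
            out[o++] = '.';             /* separator between labels */
        memcpy(out + o, msg + off, len);
        o += len;
        off += len;
    }
    return 0;                           /* ran off the end of the buffer */
}
```

On the 17 question bytes above it yields "example.com" and returns 13, the offset at which QTYPE starts.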
Then come the six addresses that DNS returned for example.com. Every record has the same layout (regrouped here on field boundaries):
c00c 0001 0001 0000 0015 0004 xxxx xxxx
Field | Meaning |
---|---|
c00c | Name pointer → offset 0x0C (“example.com”) |
0001 | TYPE = A (host address) |
0001 | CLASS = IN (Internet) |
0000 0015 | TTL = 21 seconds (tcpdump shows it as [21s]) |
0004 | RDLENGTH = 4 bytes |
17 c0 e4 50 | RDATA = 23.192.228.80 |
The other five addresses are:
23.215.0.138
23.220.75.245
23.215.0.136
23.192.228.84
23.220.75.232
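Each answer here is 16 bytes: a 2-byte name pointer, TYPE, CLASS, a 4-byte TTL, a 2-byte RDLENGTH, and 4 bytes of address. The sketch below parses exactly that shape and nothing more general; `a_record` and `parse_a_record` are hypothetical names. It also shows that the TTL bytes are 00 00 00 15, which is the 21 seconds tcpdump prints as [21s].

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative struct; the field names are my own. */
struct a_record {
    unsigned name_ptr;  /* message offset the compression pointer refers to */
    unsigned type;      /* 1 = A */
    unsigned class_;    /* 1 = IN */
    uint32_t ttl;       /* seconds */
    char     ip[16];    /* dotted-quad text */
};

/* Parses one answer whose name is a compression pointer and whose data is
 * an IPv4 address. Returns the bytes consumed (16), or 0 if the record
 * does not have that shape. */
size_t parse_a_record(const uint8_t *r, size_t len, struct a_record *out)
{
    if (len < 16 || (r[0] & 0xc0) != 0xc0)
        return 0;
    out->name_ptr = ((unsigned)(r[0] & 0x3f) << 8) | r[1];
    out->type     = ((unsigned)r[2] << 8) | r[3];
    out->class_   = ((unsigned)r[4] << 8) | r[5];
    out->ttl      = ((uint32_t)r[6] << 24) | ((uint32_t)r[7] << 16) |
                    ((uint32_t)r[8] << 8)  | r[9];
    unsigned rdlength = ((unsigned)r[10] << 8) | r[11];
    if (out->type != 1 || rdlength != 4)
        return 0;
    snprintf(out->ip, sizeof out->ip, "%u.%u.%u.%u",
             (unsigned)r[12], (unsigned)r[13], (unsigned)r[14], (unsigned)r[15]);
    return 16;
}
```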
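5b) TCP packets
The whole exchange with example.com takes ten packets: the three-way handshake, the request and response with their acknowledgements, and the connection teardown.
- SYN (client→server): requests a connection.
- SYN-ACK (server→client): the server acknowledges and synchronizes.
- ACK (client→server): the handshake is complete.
- PSH,ACK (client→server): sends the HTTP request.
- ACK (server→client): acknowledges the request.
- PSH,ACK (server→client): returns the HTTP/1.0 200 OK response.
- ACK (client→server): acknowledges the response data.
- FIN,ACK (server→client): the server has finished sending.
- FIN,ACK (client→server): the client has finished sending too.
- ACK (server→client): final acknowledgement; the connection is closed.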
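The first three packets complete the handshake; they are shown below. In Flags, S stands for SYN (synchronize) and . stands for ACK (acknowledgement). As we will see, every packet except the first, which carries only SYN, has the ACK flag set to confirm what was received from the peer. Each ack number is the next sequence number expected from the peer: the peer's last seq plus the number of bytes it sent, with SYN and FIN each counting as one.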
09:06:37.823498 ens5 Out IP (tos 0x0, ttl 64, id 44460, offset 0, flags [DF], proto TCP (6), length 60)
    172.31.0.139.50086 > 23.192.228.80.80: Flags [S], cksum 0xa8e9 (incorrect -> 0x47d5), seq 1051058529, win 62727, options [mss 8961,sackOK,TS val 3281296776 ecr 0,nop,wscale 7], length 0
    0x0000: 4500 003c adac 4000 4006 e454 ac1f 008b E..<..@.@..T....
    0x0010: 17c0 e450 c3a6 0050 3ea5 e161 0000 0000 ...P...P>..a....
    0x0020: a002 f507 a8e9 0000 0204 2301 0402 080a ..........#.....
    0x0030: c394 9d88 0000 0000 0103 0307 ............
09:06:37.985055 ens5 In IP (tos 0x0, ttl 50, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    23.192.228.80.80 > 172.31.0.139.50086: Flags [S.], cksum 0xf415 (correct), seq 4242621896, ack 1051058530, win 65160, options [mss 1460,sackOK,TS val 1795013074 ecr 3281296776,nop,wscale 7], length 0
    0x0000: 4500 003c 0000 4000 3206 a001 17c0 e450 E..<..@.2......P
    0x0010: ac1f 008b 0050 c3a6 fce1 45c8 3ea5 e162 .....P....E.>..b
    0x0020: a012 fe88 f415 0000 0204 05b4 0402 080a ................
    0x0030: 6afd b9d2 c394 9d88 0103 0307 j...........
09:06:37.985084 ens5 Out IP (tos 0x0, ttl 64, id 44461, offset 0, flags [DF], proto TCP (6), length 52)
    172.31.0.139.50086 > 23.192.228.80.80: Flags [.], cksum 0xa8e1 (incorrect -> 0x1ede), seq 1, ack 1, win 491, options [nop,nop,TS val 3281296938 ecr 1795013074], length 0
    0x0000: 4500 0034 adad 4000 4006 e45b ac1f 008b E..4..@.@..[....
    0x0010: 17c0 e450 c3a6 0050 3ea5 e162 fce1 45c9 ...P...P>..b..E.
    0x0020: 8010 01eb a8e1 0000 0101 080a c394 9e2a ...............*
    0x0030: 6afd b9d2 j...
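The acknowledgement numbers in the handshake follow a simple rule, sketched below with a made-up helper `expected_ack`: the receiver acknowledges the sender's sequence number plus the payload length, and SYN and FIN each consume one extra sequence number. Note that after the handshake tcpdump prints sequence numbers relative to the initial ones (seq 1, ack 1), while the hex dump still holds the absolute values.

```c
#include <stdint.h>

#define TCP_FIN 0x01
#define TCP_SYN 0x02

/* Hypothetical helper: the ack number a receiver should send back after
 * getting a segment with this seq, payload length and flag byte. */
uint32_t expected_ack(uint32_t seq, uint32_t payload_len, uint8_t flags)
{
    uint32_t next = seq + payload_len;  /* unsigned math wraps modulo 2^32 */
    if (flags & TCP_SYN)
        next += 1;                      /* SYN occupies one sequence number */
    if (flags & TCP_FIN)
        next += 1;                      /* so does FIN */
    return next;
}
```

For the client's SYN, `expected_ack(1051058529u, 0, TCP_SYN)` gives 1051058530, which is the ack value in the server's SYN-ACK.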
The three data-transfer packets in the middle follow. The flag P is PSH (push). The sixth packet is long, so I have not pasted it in full.
09:06:37.985300 ens5 Out IP (tos 0x0, ttl 64, id 44462, offset 0, flags [DF], proto TCP (6), length 89)
    172.31.0.139.50086 > 23.192.228.80.80: Flags [P.], cksum 0xa906 (incorrect -> 0xd60c), seq 1:38, ack 1, win 491, options [nop,nop,TS val 3281296938 ecr 1795013074], length 37: HTTP, length: 37
    GET / HTTP/1.0
    Host: example.com
    0x0000: 4500 0059 adae 4000 4006 e435 ac1f 008b E..Y..@.@..5....
    0x0010: 17c0 e450 c3a6 0050 3ea5 e162 fce1 45c9 ...P...P>..b..E.
    0x0020: 8018 01eb a906 0000 0101 080a c394 9e2a ...............*
    0x0030: 6afd b9d2 4745 5420 2f20 4854 5450 2f31 j...GET./.HTTP/1
    0x0040: 2e30 0d0a 486f 7374 3a20 6578 616d 706c .0..Host:.exampl
    0x0050: 652e 636f 6d0d 0a0d 0a e.com....
09:06:38.146863 ens5 In IP (tos 0x0, ttl 50, id 48462, offset 0, flags [DF], proto TCP (6), length 52)
    23.192.228.80.80 > 172.31.0.139.50086: Flags [.], cksum 0x1e05 (correct), seq 1, ack 38, win 509, options [nop,nop,TS val 1795013236 ecr 3281296938], length 0
    0x0000: 4500 0034 bd4e 4000 3206 e2ba 17c0 e450 E..4.N@.2......P
    0x0010: ac1f 008b 0050 c3a6 fce1 45c9 3ea5 e187 .....P....E.>...
    0x0020: 8010 01fd 1e05 0000 0101 080a 6afd ba74 ............j..t
    0x0030: c394 9e2a ...*
09:06:38.155752 ens5 In IP (tos 0x0, ttl 50, id 48463, offset 0, flags [DF], proto TCP (6), length 830)
    23.192.228.80.80 > 172.31.0.139.50086: Flags [P.], cksum 0xbe51 (correct), seq 1:779, ack 38, win 509, options [nop,nop,TS val 1795013245 ecr 3281296938], length 778: HTTP, length: 778
    HTTP/1.0 200 OK
    Content-Type: text/html
    ......
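The last four packets wrap up the connection, and a new flag appears: F, for FIN (finish). The first of the four is the client's ACK of the response data. The teardown itself takes three segments here rather than the textbook four, because the client's FIN also carries the ACK of the server's FIN.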
09:06:38.155792 ens5 Out IP (tos 0x0, ttl 64, id 44463, offset 0, flags [DF], proto TCP (6), length 52)
    172.31.0.139.50086 > 23.192.228.80.80: Flags [.], cksum 0xa8e1 (incorrect -> 0x1a47), seq 38, ack 779, win 510, options [nop,nop,TS val 3281297108 ecr 1795013245], length 0
    0x0000: 4500 0034 adaf 4000 4006 e459 ac1f 008b E..4..@.@..Y....
    0x0010: 17c0 e450 c3a6 0050 3ea5 e187 fce1 48d3 ...P...P>.....H.
    0x0020: 8010 01fe a8e1 0000 0101 080a c394 9ed4 ................
    0x0030: 6afd ba7d j..}
09:06:38.157430 ens5 In IP (tos 0x0, ttl 50, id 48464, offset 0, flags [DF], proto TCP (6), length 52)
    23.192.228.80.80 > 172.31.0.139.50086: Flags [F.], cksum 0x1aef (correct), seq 779, ack 38, win 509, options [nop,nop,TS val 1795013247 ecr 3281296938], length 0
    0x0000: 4500 0034 bd50 4000 3206 e2b8 17c0 e450 E..4.P@.2......P
    0x0010: ac1f 008b 0050 c3a6 fce1 48d3 3ea5 e187 .....P....H.>...
    0x0020: 8011 01fd 1aef 0000 0101 080a 6afd ba7f ............j...
    0x0030: c394 9e2a ...*
09:06:38.157642 ens5 Out IP (tos 0x0, ttl 64, id 44464, offset 0, flags [DF], proto TCP (6), length 52)
    172.31.0.139.50086 > 23.192.228.80.80: Flags [F.], cksum 0xa8e1 (incorrect -> 0x1a41), seq 38, ack 780, win 510, options [nop,nop,TS val 3281297110 ecr 1795013247], length 0
    0x0000: 4500 0034 adb0 4000 4006 e458 ac1f 008b E..4..@.@..X....
    0x0010: 17c0 e450 c3a6 0050 3ea5 e187 fce1 48d4 ...P...P>.....H.
    0x0020: 8011 01fe a8e1 0000 0101 080a c394 9ed6 ................
    0x0030: 6afd ba7f j...
09:06:38.319203 ens5 In IP (tos 0x0, ttl 50, id 48465, offset 0, flags [DF], proto TCP (6), length 52)
    23.192.228.80.80 > 172.31.0.139.50086: Flags [.], cksum 0x19a1 (correct), seq 780, ack 39, win 509, options [nop,nop,TS val 1795013408 ecr 3281297110], length 0
    0x0000: 4500 0034 bd51 4000 3206 e2b7 17c0 e450 E..4.Q@.2......P
    0x0010: ac1f 008b 0050 c3a6 fce1 48d4 3ea5 e188 .....P....H.>...
    0x0020: 8010 01fd 19a1 0000 0101 080a 6afd bb20 ............j...
    0x0030: c394 9ed6 ....
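The flag letters in these captures map directly onto bits of the TCP flags byte, which is the second byte of words such as a002, a012, 8018, and 8011 in the hex dumps. The sketch below mimics the notation seen in the captures; `tcp_flags_str` is a made-up helper, not tcpdump's own code, and it covers only FIN, SYN, RST, PSH and ACK.

```c
#include <stddef.h>
#include <stdint.h>

/* Renders a TCP flags byte in the letter notation used by the captures:
 * F = FIN, S = SYN, R = RST, P = PSH, '.' = ACK. out needs 6 bytes. */
void tcp_flags_str(uint8_t flags, char *out)
{
    size_t n = 0;
    if (flags & 0x01) out[n++] = 'F';
    if (flags & 0x02) out[n++] = 'S';
    if (flags & 0x04) out[n++] = 'R';
    if (flags & 0x08) out[n++] = 'P';
    if (flags & 0x10) out[n++] = '.';
    out[n] = '\0';
}
```

With this mapping 0x02 renders as "S", 0x12 as "S.", 0x18 as "P.", 0x11 as "F." and 0x10 as ".".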
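As with the DNS packets above, let's take apart the raw bytes of the TCP traffic. The first 20 bytes of each of the ten packets are again the IP header, very similar to the one in the DNS packets. Here is the header of the first packet (the SYN):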
4500 003c adac 4000 4006 e454 ac1f 008b 17c0 e450
Field | Bytes | Meaning |
---|---|---|
Version & IHL | 45 | Version 4, header = 5×4 = 20 bytes |
TOS | 00 | Type of Service = 0 |
Total Length | 003c → 60 | 60 bytes total (20 IP + 40 TCP) |
Identification | adac | Packet ID (used for fragmentation) |
Flags & Fragment Offset | 4000 | DF (Don’t Fragment) set |
TTL | 40 | 64 hops |
Protocol | 06 | TCP |
Header Checksum | e454 | IP checksum |
Source IP | ac1f 008b → 172.31.0.139 | Private IP of my AWS EC2 instance (Hong Kong) |
Destination IP | 17c0 e450 → 23.192.228.80 | IP of example.com, taken from the DNS response |
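The header checksum e454 can be verified by hand. A minimal sketch of the one's-complement Internet checksum (the algorithm described in RFC 1071) follows; `inet_checksum` is my own name for it. Summing a header that already contains a correct checksum gives 0, and summing it with the checksum field zeroed reproduces the stored value.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over len bytes, treating the data
 * as big-endian 16-bit words. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)
        sum += (uint32_t)data[i] << 8;  /* odd trailing byte, zero-padded */
    while (sum >> 16)                   /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

For the SYN packet's header the ten words add up to 0x3fffc, which folds to 0xffff, so the function returns 0.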
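The TCP header is much larger than the 8-byte UDP header: 40 bytes in this SYN, made up of 20 fixed bytes plus 20 bytes of options. The fixed part looks like this: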
Field | Bytes | Meaning |
---|---|---|
Src Port | c3a6 → 50086 | Client ephemeral port |
Dst Port | 0050 → 80 | HTTP |
Seq Num | 3ea5e161 → 1051058529 | Initial sequence |
Ack Num | 00000000 | Not yet used (SYN) |
Offset/Flags | a002 | offset=10 (40 bytes), flags=SYN |
Window Size | f507 → 62727 | |
Checksum | a8e9 | |
Urgent Ptr | 0000 | Usually zero |
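The fixed 20 bytes of the TCP header can be decoded the same way as the IP header. This is a sketch with illustrative names (`tcp_info`, `parse_tcp`) rather than the structs from the system headers, and it does not parse the options that follow the fixed part.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative struct; the field names are my own. */
struct tcp_info {
    unsigned src_port, dst_port;
    uint32_t seq, ack;
    unsigned header_len;    /* data offset * 4, in bytes (includes options) */
    unsigned flags;         /* FIN=0x01 SYN=0x02 RST=0x04 PSH=0x08 ACK=0x10 */
    unsigned window;
    unsigned checksum, urgent;
};

/* Reads a big-endian (network order) 32-bit value. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  | p[3];
}

/* Decodes the fixed part of a TCP header. Returns 0 on success, -1 if the
 * buffer is shorter than 20 bytes or the data offset is invalid. */
int parse_tcp(const uint8_t *b, size_t len, struct tcp_info *out)
{
    if (len < 20)
        return -1;
    out->src_port   = ((unsigned)b[0] << 8) | b[1];
    out->dst_port   = ((unsigned)b[2] << 8) | b[3];
    out->seq        = be32(b + 4);
    out->ack        = be32(b + 8);
    out->header_len = (b[12] >> 4) * 4u;
    out->flags      = b[13];
    out->window     = ((unsigned)b[14] << 8) | b[15];
    out->checksum   = ((unsigned)b[16] << 8) | b[17];
    out->urgent     = ((unsigned)b[18] << 8) | b[19];
    return out->header_len >= 20 ? 0 : -1;
}
```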
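The TCP payload that follows is easy to read. The handshake and teardown packets carry no application data at all; they exist only to set up and close the TCP connection. The client's request is in the fourth packet. Because HTTP, unlike HTTPS, does not encrypt the message, the ASCII bytes of the request text GET / HTTP/1.0\r\nHost: example.com\r\n\r\n are directly visible in the hex dump.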
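To check this by hand, the hex groups from the dump can be turned back into bytes. A small sketch, with a made-up helper `hex_to_bytes` that accepts the space-separated groups exactly as tcpdump prints them:

```c
#include <stddef.h>
#include <stdint.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Converts hex text such as "4745 5420" into bytes, skipping spaces.
 * Returns the number of bytes written, or 0 on bad input or overflow. */
size_t hex_to_bytes(const char *hex, uint8_t *out, size_t cap)
{
    size_t n = 0;
    while (*hex) {
        if (*hex == ' ') {
            hex++;
            continue;
        }
        int hi = hex_val(hex[0]);
        int lo = hi < 0 ? -1 : hex_val(hex[1]);
        if (hi < 0 || lo < 0 || n >= cap)
            return 0;
        out[n++] = (uint8_t)((hi << 4) | lo);
        hex += 2;
    }
    return n;
}
```

Decoding "4745 5420 2f20 4854 5450 2f31 2e30 0d0a" yields the 16 bytes of "GET / HTTP/1.0\r\n", the first line of the request.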
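6) Summary
We have all heard about the layered network model and about TCP's handshake and teardown. My preferred way to learn is to run a concrete example like this one, then read carefully which syscalls the kernel executed and what was sent and received in each exchange with the outside world, byte by byte. That is what makes the layered encapsulation of networking click for me.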