Cilium Hands-on Lab: Journey to Mastery --- 26. Cilium Host Firewall
- 1. Environment Overview
- 2. Cilium CLI
- 3. SSH Access to the Nodes
- 3.1 Observing Traffic with Hubble
- 3.2 SSH to the Nodes
- 3.3 Viewing the Traffic in Hubble
- 4. Host Network Requests
- 4.1 The Host Identity
- 4.2 Designing the Network Policies
- 4.3 API Server Access
- 4.4 Default Deny Rule
- 4.5 Hubble Access
- 4.6 Access from the Bastion
- 4.7 Quiz
- 5. Final Exam
- 5.1 The Task
- 5.2 Solution
- 5.2.1 API Server Access Policy
- 5.2.2 Default Deny Rule
- 5.2.3 SSH Access Policy
1. Environment Overview
Lab environment URL:
https://isovalent.com/labs/cilium-host-firewall/
We are running a Kind Kubernetes cluster, with Cilium on top of it.
While the Kind cluster finishes starting up, let's take a look at its configuration:
root@server:~# yq /etc/kind/${KIND_CONFIG}.yaml
---
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraPortMappings:
      # localhost.run proxy
      - containerPort: 32042
        hostPort: 32042
      # Hubble relay
      - containerPort: 31234
        hostPort: 31234
      # Hubble UI
      - containerPort: 31235
        hostPort: 31235
  - role: worker
  - role: worker
networking:
  disableDefaultCNI: true
  kubeProxyMode: none
In the nodes section, you can see that the cluster consists of 3 nodes: 1 kind-control-plane and 2 kind-worker nodes.
In the networking section of the configuration file, the default CNI is disabled, so the cluster will start without any Pod networking. Instead, Cilium will be deployed to the cluster to provide this functionality.
root@server:~# k get nodes
NAME STATUS ROLES AGE VERSION
kind-control-plane NotReady control-plane 80m v1.31.0
kind-worker NotReady <none> 80m v1.31.0
kind-worker2 NotReady <none> 80m v1.31.0
You should see three nodes appear, all marked NotReady. This is expected, since the CNI is disabled; we will install Cilium in the next step. If you don't see all the nodes yet, the worker nodes may still be joining the cluster. Re-run the command until all three nodes are listed.
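Alternatively, you can watch the node list until all three nodes show up (a small optional convenience; --watch is a standard kubectl flag):
kubectl get nodes --watch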
2. Cilium CLI
To use the Cilium Host Firewall, we need to enable it explicitly. We also need to use Kube Proxy Replacement (KPR) mode, since it is a requirement for the Host Firewall feature:
cilium install \
  --version 1.17.1 \
  --set hostFirewall.enabled=true \
  --set kubeProxyReplacement=true \
  --set bpf.monitorAggregation=none
Enable Hubble so that we get flow observability:
cilium hubble enable
After a few minutes, Cilium should be installed with Hubble enabled. You can check that everything is running fine:
root@server:~# cilium status --wait
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    OK
 \__/¯¯\__/    Hubble Relay:       OK
    \__/       ClusterMesh:        disabled

DaemonSet              cilium             Desired: 3, Ready: 3/3, Available: 3/3
DaemonSet              cilium-envoy       Desired: 3, Ready: 3/3, Available: 3/3
Deployment             cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
Deployment             hubble-relay       Desired: 1, Ready: 1/1, Available: 1/1
Containers:            cilium             Running: 3
                       cilium-envoy       Running: 3
                       cilium-operator    Running: 1
                       clustermesh-apiserver
                       hubble-relay       Running: 1
Cluster Pods:          4/4 managed by Cilium
Helm chart version:    1.17.1
Image versions         cilium             quay.io/cilium/cilium:v1.17.1@sha256:8969bfd9c87cbea91e40665f8ebe327268c99d844ca26d7d12165de07f702866: 3
                       cilium-envoy       quay.io/cilium/cilium-envoy:v1.31.5-1739264036-958bef243c6c66fcfd73ca319f2eb49fff1eb2ae@sha256:fc708bd36973d306412b2e50c924cd8333de67e0167802c9b48506f9d772f521: 3
                       cilium-operator    quay.io/cilium/operator-generic:v1.17.1@sha256:628becaeb3e4742a1c36c4897721092375891b58bae2bfcae48bbf4420aaee97: 1
                       hubble-relay       quay.io/cilium/hubble-relay:v1.17.1@sha256:397e8fbb188157f744390a7b272a1dec31234e605bcbe22d8919a166d202a3dc: 1
Verify that the Host Firewall feature is activated:
root@server:~# cilium config view | grep host-firewall
enable-host-firewall true
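Since Kube Proxy Replacement is also required by the Host Firewall, you can optionally confirm it is active as well (the kube-proxy-replacement key is the one set by the install command above):
cilium config view | grep kube-proxy-replacement
It should report true.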
3. SSH Access to the Nodes
3.1 Observing Traffic with Hubble
Since we enabled Hubble in the Cilium installation, we can observe the network flows that Cilium is routing.
Filter on port 22:
hubble observe --to-identity 1 --port 22 -f
3.2 SSH to the Nodes
Kind runs its nodes as Docker containers rather than virtual machines.
Let's verify that we can reach port 22 on each node:
for node in $(docker ps --format '{{.Names}}'); do
  echo "==== Testing connection to node $node ===="
  IP=$(docker inspect $node -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}')
  nc -vz -w2 $IP 22
done
Every connection should succeed, showing that SSH is reachable from the host to any node in the cluster.
3.3 Viewing the Traffic in Hubble
In the Hubble output, you should see the TCP/22 requests to all nodes being forwarded:
Jun 6 06:08:25.737: 172.18.0.1:36354 (world) <> 172.18.0.4:22 (host) from-network FORWARDED (TCP Flags: SYN)
Jun 6 06:08:25.737: 172.18.0.1:36354 (world) <> 172.18.0.4:22 (host) from-network FORWARDED (TCP Flags: ACK)
Jun 6 06:08:25.737: 172.18.0.1:36354 (world) <> 172.18.0.4:22 (host) from-network FORWARDED (TCP Flags: ACK, FIN)
Jun 6 06:08:25.746: 172.18.0.1:36354 (world) <> 172.18.0.4:22 (host) from-network FORWARDED (TCP Flags: RST)
Jun 6 06:08:25.752: 172.18.0.1:38110 (world) <> 172.18.0.2:22 (host) from-network FORWARDED (TCP Flags: SYN)
Jun 6 06:08:25.752: 172.18.0.1:38110 (world) <> 172.18.0.2:22 (host) from-network FORWARDED (TCP Flags: ACK)
Jun 6 06:08:25.752: 172.18.0.1:38110 (world) <> 172.18.0.2:22 (host) from-network FORWARDED (TCP Flags: ACK, FIN)
Jun 6 06:08:25.761: 172.18.0.1:38110 (world) <> 172.18.0.2:22 (host) from-network FORWARDED (TCP Flags: RST)
Jun 6 06:08:25.767: 172.18.0.1:40904 (world) <> 172.18.0.3:22 (host) from-network FORWARDED (TCP Flags: SYN)
Jun 6 06:08:25.767: 172.18.0.1:40904 (world) <> 172.18.0.3:22 (host) from-network FORWARDED (TCP Flags: ACK)
Jun 6 06:08:25.767: 172.18.0.1:40904 (world) <> 172.18.0.3:22 (host) from-network FORWARDED (TCP Flags: ACK, FIN)
Jun 6 06:08:25.775: 172.18.0.1:40904 (world) <> 172.18.0.3:22 (host) from-network FORWARDED (TCP Flags: RST)
4. Host Network Requests
4.1 The Host Identity
Let's check the current state of policy enforcement on the nodes.
To do this, we need to identify the Cilium pod running on a given node.
For example, for the control plane node:
root@server:~# kubectl get pods -n kube-system -l k8s-app=cilium
NAME READY STATUS RESTARTS AGE
cilium-8xk9n 1/1 Running 0 11m
cilium-ghrbc 1/1 Running 0 11m
cilium-np55d 1/1 Running 0 11m
This should list the three Cilium agent pods (one per node).
Let's pick the first one and exec into it to list the endpoints known on that node:
root@server:~# kubectl exec -it -n kube-system cilium-8xk9n -- cilium endpoint list
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                                                   IPv6   IPv4           STATUS
           ENFORCEMENT        ENFORCEMENT
32         Disabled           Disabled          4          reserved:health                                                                      10.244.1.45    ready
161        Disabled           Disabled          9437       k8s:app.kubernetes.io/name=hubble-relay                                              10.244.1.173   ready
                                                           k8s:app.kubernetes.io/part-of=cilium
                                                           k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=kube-system
                                                           k8s:io.cilium.k8s.policy.cluster=kind-kind
                                                           k8s:io.cilium.k8s.policy.serviceaccount=hubble-relay
                                                           k8s:io.kubernetes.pod.namespace=kube-system
                                                           k8s:k8s-app=hubble-relay
280        Disabled           Disabled          1          reserved:host                                                                                       ready
Note the line with the label reserved:host and identity 1. This is the special reserved identity of the local host:
ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])   IPv6   IPv4   STATUS
280        Disabled           Disabled          1          reserved:host                                ready
This information can be used, for example, to filter the Hubble output with the --identity 1 parameter.
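For instance, to follow every flow to or from the host identity (the same hubble CLI as before, simply without the port filter):
hubble observe --identity 1 -f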
4.2 Designing the Network Policies
To secure access to our nodes, we want to restrict SSH access to them.
We will protect the nodes with the following rules:
- For SSH (tcp/22), we will use the control plane node as a bastion to access the other nodes
- We still need to access the Kubernetes API server on the control plane (tcp/6443)
- Nodes need to be able to talk to each other over VXLAN (udp/8472)
Everything else should be denied by default.
To implement these rules, we will use CiliumClusterwideNetworkPolicy (or ccnp) resources. This type of network policy applies globally to the entire cluster, instead of being scoped to a single namespace like standard network policy resources.
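As a rough illustration of what distinguishes these host policies (the policy name and labels below are purely hypothetical, not part of this lab): a cluster-wide policy that targets nodes selects them with spec.nodeSelector, whereas a pod-level policy would use spec.endpointSelector.
---
# Illustrative sketch only: a cluster-wide policy applied to node (host) endpoints
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "example-node-policy"   # hypothetical name
spec:
  nodeSelector:                 # selects nodes, i.e. their reserved:host endpoints
    matchLabels:
      kubernetes.io/os: linux   # hypothetical node label
  ingress:
    - fromEntities:
        - remote-node           # allow ingress coming from the other nodes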
4.3 API Server Access
Let's start with API server access.
Let's create a CiliumClusterwideNetworkPolicy that targets the control plane node and allows ingress on port 6443 over TCP.
Check the labels of the control plane node with the following command:
root@server:~# kubectl get no kind-control-plane -o yaml | yq .metadata.labels
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-control-plane
kubernetes.io/os: linux
node-role.kubernetes.io/control-plane: ""
node.kubernetes.io/exclude-from-external-load-balancers: ""
The node has a distinctive label, node-role.kubernetes.io/control-plane, with an empty value. We will use it to target the node in the policy. Save the following as ccnp-control-plane-apiserver.yaml:
---
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "control-plane-apiserver"
spec:
  description: "Allow Kubernetes API Server to Control Plane"
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/control-plane: ""
  ingress:
    - toPorts:
        - ports:
            - port: "6443"
              protocol: TCP
Apply it:
kubectl apply -f ccnp-control-plane-apiserver.yaml
4.4 Default Deny Rule
Now that we know we won't cut off our own access to the API server, we can create a default-deny rule. Using the Editor tab, save the following to ccnp-default-deny.yaml:
---
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "default-deny"
spec:
  description: "Block all unknown traffic to nodes"
  nodeSelector: {}
  ingress:
    - fromEntities:
        - cluster
The empty nodeSelector makes this policy apply to all nodes, and the fromEntities: ['cluster'] filter only allows traffic coming from inside the cluster. This rule therefore effectively blocks all traffic to the nodes unless it originates from within the cluster; in particular, node-to-node traffic such as VXLAN (udp/8472) stays allowed, since it comes from the cluster's own nodes.
Apply it:
kubectl apply -f ccnp-default-deny.yaml
Verify that you can still access the API server by listing the CiliumClusterwideNetworkPolicy resources:
root@server:~# kubectl get ccnp
NAME VALID
control-plane-apiserver True
default-deny True
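Optionally, you can also confirm that policy enforcement is now active on the host endpoint. Re-running the endpoint list from section 4.1 (the pod name below is the one from our earlier output; yours will differ), the reserved:host row should now show Enabled for ingress enforcement:
kubectl exec -n kube-system cilium-8xk9n -- cilium endpoint list | grep reserved:host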
4.5 Hubble Access
Watch the Hubble logs:
hubble observe --identity 1 --port 22 -f
Let's see what happens if we now try to reach the nodes over SSH.
Try to establish SSH connections to all nodes:
root@server:~# for node in $(docker ps --format '{{.Names}}'); do
  echo "==== Testing connection to node $node ===="
  IP=$(docker inspect $node -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}')
  nc -vz -w2 $IP 22
done
==== Testing connection to node kind-control-plane ====
nc: connect to 172.18.0.4 port 22 (tcp) timed out: Operation now in progress
==== Testing connection to node kind-worker2 ====
nc: connect to 172.18.0.2 port 22 (tcp) timed out: Operation now in progress
==== Testing connection to node kind-worker ====
nc: connect to 172.18.0.3 port 22 (tcp) timed out: Operation now in progress
In the Hubble output, we can now see the dropped packets:
Jun 6 06:17:52.688: 172.18.0.1:39746 (world) <> 172.18.0.4:22 (host) from-network FORWARDED (TCP Flags: SYN)
Jun 6 06:17:52.688: 172.18.0.1:39746 (world) <> 172.18.0.4:22 (host) policy-verdict:none INGRESS DENIED (TCP Flags: SYN)
Jun 6 06:17:52.688: 172.18.0.1:39746 (world) <> 172.18.0.4:22 (host) Policy denied DROPPED (TCP Flags: SYN)
Jun 6 06:17:53.724: 172.18.0.1:39746 (world) <> 172.18.0.4:22 (host) from-network FORWARDED (TCP Flags: SYN)
Jun 6 06:17:53.724: 172.18.0.1:39746 (world) <> 172.18.0.4:22 (host) policy-verdict:none INGRESS DENIED (TCP Flags: SYN)
Jun 6 06:17:53.724: 172.18.0.1:39746 (world) <> 172.18.0.4:22 (host) Policy denied DROPPED (TCP Flags: SYN)
Jun 6 06:17:54.706: 172.18.0.1:50676 (world) <> 172.18.0.2:22 (host) from-network FORWARDED (TCP Flags: SYN)
Jun 6 06:17:54.706: 172.18.0.1:50676 (world) <> 172.18.0.2:22 (host) policy-verdict:none INGRESS DENIED (TCP Flags: SYN)
Jun 6 06:17:54.706: 172.18.0.1:50676 (world) <> 172.18.0.2:22 (host) Policy denied DROPPED (TCP Flags: SYN)
Jun 6 06:17:55.709: 172.18.0.1:50676 (world) <> 172.18.0.2:22 (host) from-network FORWARDED (TCP Flags: SYN)
Jun 6 06:17:55.709: 172.18.0.1:50676 (world) <> 172.18.0.2:22 (host) policy-verdict:none INGRESS DENIED (TCP Flags: SYN)
Jun 6 06:17:55.709: 172.18.0.1:50676 (world) <> 172.18.0.2:22 (host) Policy denied DROPPED (TCP Flags: SYN)
Jun 6 06:17:56.722: 172.18.0.1:35928 (world) <> 172.18.0.3:22 (host) from-network FORWARDED (TCP Flags: SYN)
Jun 6 06:17:56.722: 172.18.0.1:35928 (world) <> 172.18.0.3:22 (host) policy-verdict:none INGRESS DENIED (TCP Flags: SYN)
Jun 6 06:17:56.722: 172.18.0.1:35928 (world) <> 172.18.0.3:22 (host) Policy denied DROPPED (TCP Flags: SYN)
Jun 6 06:17:57.756: 172.18.0.1:35928 (world) <> 172.18.0.3:22 (host) from-network FORWARDED (TCP Flags: SYN)
Jun 6 06:17:57.756: 172.18.0.1:35928 (world) <> 172.18.0.3:22 (host) policy-verdict:none INGRESS DENIED (TCP Flags: SYN)
Jun 6 06:17:57.756: 172.18.0.1:35928 (world) <> 172.18.0.3:22 (host) Policy denied DROPPED (TCP Flags: SYN)
We can see that packets to TCP/22 are dropped on all 3 nodes.
Now let's implement the network policy that uses the control plane node as a bastion host.
Create a new CiliumClusterwideNetworkPolicy to allow SSH connections to the control plane node. Save it as ccnp-control-plane-ssh.yaml:
---
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "ssh"
spec:
  description: "SSH access on Control Plane"
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/control-plane: ""
  ingress:
    - toPorts:
        - ports:
            - port: "22"
              protocol: TCP
Apply it:
kubectl apply -f ccnp-control-plane-ssh.yaml
Now test the SSH connections again:
root@server:~# for node in $(docker ps --format '{{.Names}}'); do
  echo "==== Testing connection to node $node ===="
  IP=$(docker inspect $node -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}')
  nc -vz -w2 $IP 22
done
==== Testing connection to node kind-control-plane ====
Connection to 172.18.0.4 22 port [tcp/ssh] succeeded!
==== Testing connection to node kind-worker2 ====
nc: connect to 172.18.0.2 port 22 (tcp) timed out: Operation now in progress
==== Testing connection to node kind-worker ====
nc: connect to 172.18.0.3 port 22 (tcp) timed out: Operation now in progress
It should only work on one node: the control plane.
Check the logs again to verify that packets to the control plane are forwarded, while packets to the other nodes are blocked.
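Note that this rule allows SSH to the control plane from any source. If you wanted to tighten it further, an ingress rule can combine a source filter with the port, for example using fromCIDR (the policy name and CIDR below are hypothetical, purely to illustrate the shape of such a rule; they are not part of this lab):
---
# Illustrative sketch: restrict SSH on the control plane to a management network
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "ssh-restricted"             # hypothetical name
spec:
  description: "SSH to the Control Plane from a management network only"
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/control-plane: ""
  ingress:
    - fromCIDR:
        - 172.18.0.0/16              # hypothetical management CIDR
      toPorts:
        - ports:
            - port: "22"
              protocol: TCP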
4.6 Access from the Bastion
The last thing to check is that you can still reach the other nodes over SSH through the bastion host.
Start a shell on the control plane node:
root@server:~# docker exec -ti kind-control-plane bash
root@kind-control-plane:/# for node in $(kubectl get node -o name); do
  echo "==== Testing connection to node $node ===="
  IP=$(kubectl get $node -o jsonpath='{.status.addresses[0].address}')
  nc -vz -w2 $IP 22
done
==== Testing connection to node node/kind-control-plane ====
Connection to 172.18.0.4 22 port [tcp/ssh] succeeded!
==== Testing connection to node node/kind-worker ====
Connection to 172.18.0.3 22 port [tcp/ssh] succeeded!
==== Testing connection to node node/kind-worker2 ====
Connection to 172.18.0.2 22 port [tcp/ssh] succeeded!
root@kind-control-plane:/#
All connections should succeed, since all traffic is allowed inside the cluster.
4.7 Quiz
The Cilium Host Firewall:
× is enabled by default
√ requires the use of CiliumClusterwideNetworkPolicy resources
√ requires running Cilium in Kube Proxy Replacement mode
× is incompatible with Hubble
5. Final Exam
5.1 The Task
For this practical exam, all CiliumClusterwideNetworkPolicy resources have been deleted and the YAML manifests have been reset.
You need to restore the configuration by:
- editing the 3 manifests to fill in the correct values
- applying the manifests to the cluster. Be careful about the order, so you don't lock yourself out!
5.2 Solution
5.2.1 API Server Access Policy
Get the labels of the control plane node:
root@server:~# kubectl get no kind-control-plane -o yaml | yq .metadata.labels
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-control-plane
kubernetes.io/os: linux
node-role.kubernetes.io/control-plane: ""
node.kubernetes.io/exclude-from-external-load-balancers: ""
Fill in the CiliumClusterwideNetworkPolicy manifest for the API server and apply it:
root@server:~# yq ccnp-control-plane-apiserver.yaml
---
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "control-plane-apiserver"
spec:
  description: "Allow Kubernetes API Server to Control Plane"
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/control-plane: ""
  ingress:
    - toPorts:
        - ports:
            - port: "6443"
              protocol: TCP
root@server:~# k apply -f ccnp-control-plane-apiserver.yaml
ciliumclusterwidenetworkpolicy.cilium.io/control-plane-apiserver created
Test:
root@server:~# for node in $(docker ps --format '{{.Names}}'); do
  echo "==== Testing connection to node $node ===="
  IP=$(docker inspect $node -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}')
  nc -vz -w2 $IP 22
done
==== Testing connection to node kind-control-plane ====
nc: connect to 172.18.0.4 port 22 (tcp) timed out: Operation now in progress
==== Testing connection to node kind-worker2 ====
Connection to 172.18.0.2 22 port [tcp/ssh] succeeded!
==== Testing connection to node kind-worker ====
Connection to 172.18.0.3 22 port [tcp/ssh] succeeded!
5.2.2 Default Deny Rule
Fill in and apply the policy:
root@server:~# yq ccnp-default-deny.yaml
---
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "default-deny"
spec:
  description: "Block all unknown traffic to nodes"
  nodeSelector: {}
  ingress:
    - fromEntities:
        - cluster
root@server:~# k apply -f ccnp-default-deny.yaml
ciliumclusterwidenetworkpolicy.cilium.io/default-deny created
Test:
root@server:~# for node in $(docker ps --format '{{.Names}}'); do
  echo "==== Testing connection to node $node ===="
  IP=$(docker inspect $node -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}')
  nc -vz -w2 $IP 22
done
==== Testing connection to node kind-control-plane ====
nc: connect to 172.18.0.4 port 22 (tcp) timed out: Operation now in progress
==== Testing connection to node kind-worker2 ====
nc: connect to 172.18.0.2 port 22 (tcp) timed out: Operation now in progress
==== Testing connection to node kind-worker ====
nc: connect to 172.18.0.3 port 22 (tcp) timed out: Operation now in progress
5.2.3 SSH Access Policy
Fill in and apply the policy:
root@server:~# yq ccnp-control-plane-ssh.yaml
---
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "ssh"
spec:
  description: "SSH access on Control Plane"
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/control-plane: ""
  ingress:
    - toPorts:
        - ports:
            - port: "22"
              protocol: TCP
root@server:~# k apply -f ccnp-control-plane-ssh.yaml
ciliumclusterwidenetworkpolicy.cilium.io/ssh created
Test:
root@server:~# for node in $(docker ps --format '{{.Names}}'); do
  echo "==== Testing connection to node $node ===="
  IP=$(docker inspect $node -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}')
  nc -vz -w2 $IP 22
done
==== Testing connection to node kind-control-plane ====
Connection to 172.18.0.4 22 port [tcp/ssh] succeeded!
==== Testing connection to node kind-worker2 ====
nc: connect to 172.18.0.2 port 22 (tcp) timed out: Operation now in progress
==== Testing connection to node kind-worker ====
nc: connect to 172.18.0.3 port 22 (tcp) timed out: Operation now in progress
Test access from the bastion host:
root@server:~# docker exec -ti kind-control-plane bash
root@kind-control-plane:/# for node in $(kubectl get node -o name); do
  echo "==== Testing connection to node $node ===="
  IP=$(kubectl get $node -o jsonpath='{.status.addresses[0].address}')
  nc -vz -w2 $IP 22
done
==== Testing connection to node node/kind-control-plane ====
Connection to 172.18.0.4 22 port [tcp/ssh] succeeded!
==== Testing connection to node node/kind-worker ====
Connection to 172.18.0.3 22 port [tcp/ssh] succeeded!
==== Testing connection to node node/kind-worker2 ====
Connection to 172.18.0.2 22 port [tcp/ssh] succeeded!
root@kind-control-plane:/#
Everything looks good and matches our expectations.
Time to submit!
New badge earned!