k8s Container Probes
- Check Mechanisms
- 1. Liveness Probe (livenessProbe)
- 2. Readiness Probe (readinessProbe)
- 3. Startup Probe (startupProbe)
Official documentation: https://kubernetes.io/zh-cn/docs/concepts/workloads/pods/pod-lifecycle/#container-probes
Check Mechanisms
There are four different ways to use a probe to check a container. Each probe must be defined as exactly one of these four mechanisms:
- exec: executes a specified command inside the container. The diagnostic is considered successful if the command exits with status code 0.
- grpc: performs a remote procedure call using gRPC. The target should implement gRPC health checks. The diagnostic is considered successful if the status in the response is SERVING.
- httpGet: performs an HTTP GET request against the pod's IP address on a specified port and path. The diagnostic is considered successful if the response status code is greater than or equal to 200 and less than 400.
- tcpSocket: performs a TCP check against the pod's IP address on a specified port. The diagnostic is considered successful if the port is open. If the remote system (the container) closes the connection immediately after opening it, this still counts as healthy.
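A minimal sketch of how these mechanisms appear in a Pod spec (not from the original post; the paths, ports, and command below are placeholders), with one mechanism shown per probe type for illustration:

livenessProbe:
  exec:                        # run a command inside the container; exit code 0 means success
    command:
    - cat
    - /tmp/healthy
readinessProbe:
  httpGet:                     # HTTP GET against the pod IP; 200 <= status < 400 means success
    path: /healthz
    port: 8080
startupProbe:
  tcpSocket:                   # TCP connect; an open port means success
    port: 3306
# A gRPC probe requires the application to implement the gRPC health checking protocol, e.g.:
#   grpc:
#     port: 2379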
Probe results
- Success: the container passed the diagnostic.
- Failure: the container failed the diagnostic.
- Unknown: the diagnostic itself failed, so no action is taken.
1. Liveness Probe (livenessProbe)
Indicates whether the container is running. If the liveness probe fails, the kubelet kills the container, and what happens next is determined by the container's restart policy. If the container does not provide a liveness probe, the default state is Success.
[root@k8s-1 probe]# vim liveness.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: busybox:1.28
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
Execution process
[root@k8s-1 probe]# kubectl apply -f liveness.yaml
pod/liveness-exec created
[root@k8s-1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-exec 1/1 Running 0 14s 10.224.200.208 k8s-2 <none> <none>
[root@k8s-1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-exec 1/1 Running 0 36s 10.224.200.208 k8s-2 <none> <none>
[root@k8s-1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-exec 1/1 Running 0 48s 10.224.200.208 k8s-2 <none> <none>
[root@k8s-1 probe]# kubectl describe pod liveness-exec
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  66s                default-scheduler  Successfully assigned default/liveness-exec to k8s-2
  Normal   Pulled     65s                kubelet            Container image "busybox:1.28" already present on machine
  Normal   Created    65s                kubelet            Created container liveness
  Normal   Started    65s                kubelet            Started container liveness
  Warning  Unhealthy  21s (x3 over 31s)  kubelet            Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
  Normal   Killing    21s                kubelet            Container liveness failed liveness probe, will be restarted
# Per the restart policy, the container is killed and recreated; the RESTARTS counter goes to 1
[root@k8s-1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-exec 1/1 Running 1 (16s ago) 91s 10.224.200.208 k8s-2 <none> <none>
When the liveness probe detects that the container is no longer healthy, the kubelet kills the container and restarts it according to the Pod's restart policy.
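For reference (a small sketch, not part of the original example), the restart policy that drives this behavior is a Pod-level field; Always is the default, which is why the liveness-exec Pod above keeps being restarted:

spec:
  restartPolicy: Always        # Always (default) | OnFailure | Never
  containers:
  - name: liveness
    image: busybox:1.28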
2. Readiness Probe (readinessProbe)
Indicates whether the container is ready to serve requests. If the readiness probe fails, the EndpointSlice controller removes the Pod's IP address from the EndpointSlices of all Services that match the Pod. The default readiness state before the initial delay is Failure. If the container does not provide a readiness probe, the default state is Success.
[root@k8s-1 probe]# vim nginx-liveness.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: sc2
---
apiVersion: v1
kind: Pod
metadata:
  name: sc-nginx-redis
  namespace: sc2
spec:
  containers:
  - name: sc-nginx
    image: nginx:latest
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 80
    livenessProbe:
      httpGet:
        path: /
        port: 8080             # every 3 seconds, HTTP GET the root path on port 8080
      initialDelaySeconds: 3
      periodSeconds: 3
    readinessProbe:
      tcpSocket:
        port: 8090             # every 10 seconds, check a TCP connection to port 8090
      initialDelaySeconds: 5
      periodSeconds: 10
  - name: sc-redis
    image: redis:latest
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 6379
  nodeName: k8s-3
  restartPolicy: Always
The probes keep failing
[root@k8s-1 probe]# kubectl apply -f nginx-liveness.yaml
namespace/sc2 created
pod/sc-nginx-redis created
[root@k8s-1 probe]# kubectl get pod -n sc2
NAME READY STATUS RESTARTS AGE
sc-nginx-redis 1/2 Running 0 20s
[root@k8s-1 probe]# kubectl get pod -n sc2
NAME READY STATUS RESTARTS AGE
sc-nginx-redis 1/2 Running 1 (6s ago) 27s
[root@k8s-1 probe]# kubectl get pod -n sc2
NAME READY STATUS RESTARTS AGE
sc-nginx-redis 1/2 CrashLoopBackOff 2 (1s ago) 40s
[root@k8s-1 probe]# kubectl get pod -n sc2
NAME READY STATUS RESTARTS AGE
sc-nginx-redis 1/2 Running 3 (18s ago) 57s
[root@k8s-1 probe]# kubectl describe pod sc-nginx-redis -n sc2|grep -A 12 Event
Events:
  Type     Reason     Age                    From     Message
  ----     ------     ----                   ----     -------
  Normal   Pulling    3m15s                  kubelet  Pulling image "redis:latest"
  Normal   Pulled     3m3s                   kubelet  Successfully pulled image "redis:latest" in 12.406609148s (12.406629654s including waiting)
  Normal   Created    3m3s                   kubelet  Created container sc-redis
  Normal   Started    3m3s                   kubelet  Started container sc-redis
  Normal   Created    2m55s (x2 over 3m16s)  kubelet  Created container sc-nginx
  Normal   Started    2m55s (x2 over 3m15s)  kubelet  Started container sc-nginx
  Normal   Pulled     2m46s (x3 over 3m16s)  kubelet  Container image "nginx:latest" already present on machine
  Warning  Unhealthy  2m46s (x6 over 3m3s)   kubelet  Readiness probe failed: dial tcp 10.224.13.87:8090: connect: connection refused
  Warning  Unhealthy  2m46s (x6 over 3m1s)   kubelet  Liveness probe failed: Get "http://10.224.13.87:8080/": dial tcp 10.224.13.87:8080: connect: connection refused
  Normal   Killing    2m46s (x2 over 2m55s)  kubelet  Container sc-nginx failed liveness probe, will be restarted
Switching the probes to port 80
With the probes pointed at port 80 (as sketched below), both the liveness probe and the readiness probe successfully detect the nginx service, and both containers run normally.
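The original output does not show the corrected manifest; a minimal sketch of what the sc-nginx probe stanzas look like once both point at port 80 (the rest of the Pod spec is unchanged):

    livenessProbe:
      httpGet:
        path: /
        port: 80               # nginx actually listens here, so the HTTP GET succeeds
      initialDelaySeconds: 3
      periodSeconds: 3
    readinessProbe:
      tcpSocket:
        port: 80               # TCP check against the real service port
      initialDelaySeconds: 5
      periodSeconds: 10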
[root@k8s-1 probe]# vim nginx-liveness.yaml
[root@k8s-1 probe]# kubectl apply -f nginx-liveness.yaml
namespace/sc2 created
pod/sc-nginx-redis created
[root@k8s-1 probe]# kubectl get pod -n sc2
NAME READY STATUS RESTARTS AGE
sc-nginx-redis 1/2 Running 0 5s
[root@k8s-1 probe]# kubectl get pod -n sc2
NAME READY STATUS RESTARTS AGE
sc-nginx-redis 2/2 Running 0 12s
[root@k8s-1 probe]# kubectl describe pod sc-nginx-redis -n sc2|grep -A 12 Event
Events:
  Type    Reason   Age  From     Message
  ----    ------   ---- ----     -------
  Normal  Pulled   20s  kubelet  Container image "nginx:latest" already present on machine
  Normal  Created  20s  kubelet  Created container sc-nginx
  Normal  Started  20s  kubelet  Started container sc-nginx
  Normal  Pulled   20s  kubelet  Container image "redis:latest" already present on machine
  Normal  Created  20s  kubelet  Created container sc-redis
  Normal  Started  20s  kubelet  Started container sc-redis
3. Startup Probe (startupProbe)
Indicates whether the application inside the container has started. If a startup probe is provided, all other probes are disabled until it succeeds. If the startup probe fails, the kubelet kills the container, and the container is restarted according to its restart policy. If the container does not provide a startup probe, the default state is Success.
This suits applications that are slow to start, preventing the liveness probe from killing them by mistake while they are still starting up. A sketch follows.
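The original notes stop before giving a startupProbe manifest; the following is a minimal sketch in the style of the examples above (the Pod name and image are placeholders). The startup probe allows up to failureThreshold x periodSeconds = 30 x 10 = 300 seconds for the application to come up, and the liveness probe only starts running after the startup probe has succeeded:

apiVersion: v1
kind: Pod
metadata:
  name: slow-start-demo        # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: nginx:latest        # stand-in for a slow-starting application
    ports:
    - containerPort: 80
    startupProbe:
      httpGet:
        path: /
        port: 80
      failureThreshold: 30     # tolerate up to 30 failed checks...
      periodSeconds: 10        # ...every 10 s, i.e. up to 300 s to finish starting
    livenessProbe:
      httpGet:
        path: /
        port: 80
      periodSeconds: 10        # takes over only once the startup probe succeeds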