k8s中应用容器随redis集群自动重启
背景:
业务系统需要使用redis集群做缓存,但是redis集群不稳定会出现宕机问题,宕机后redis自动重启,但是业务系统不能自动重连redis。
改进方案:
当redis集群故障服务不能使用时,业务系统也要随redis一起重启。在业务系统的容器中添加redis服务探查程序。更好的方式是业务系统支持redis的服务的自动重连
apiVersion: apps/v1
kind: Deployment
metadata:name: business-app-with-monitor
spec:replicas: 3template:metadata:labels:app: business-appspec:initContainers:- name: redis-preflight-checkimage: busybox:1.35command: - /bin/sh- -c- |# 简化的启动前检查echo "执行 Redis 集群启动前检查..."for i in 0 1 2; doif ! nc -z -w 5 redis-$i.redis-headless 6379; thenecho "❌ redis-$i 不可用"exit 1fiecho "✅ redis-$i 连接成功"doneecho "所有 Redis 节点就绪"containers:# 主业务容器- name: business-appimage: your-business-app:latestports:- containerPort: 8080env:- name: REDIS_CLUSTER_NODESvalue: "redis-0.redis-headless:6379,redis-1.redis-headless:6379,redis-2.redis-headless:6379"livenessProbe:exec:command:- /bin/sh- -c- |# 检查应用是否健康,同时检查 Redis 连接if ! curl -f http://localhost:8080/health; thenexit 1fi# 检查 Redis 集群状态if ! echo "CLUSTER INFO" | nc -w 3 redis-0.redis-headless 6379 | grep -q "cluster_state:ok"; thenecho "Redis 集群异常,触发重启"exit 1fiinitialDelaySeconds: 30periodSeconds: 15timeoutSeconds: 10failureThreshold: 2# Redis 监控 Sidecar- name: redis-monitorimage: busybox:1.35command:- /bin/sh- -c- |echo "启动 Redis 集群监控..."while true; doecho "$(date): 检查 Redis 集群状态..."# 检查集群健康状态cluster_healthy=falsefor node in redis-0.redis-headless:6379 redis-1.redis-headless:6379 redis-2.redis-headless:6379; dohost=${node%:*}port=${node#*:}if echo "CLUSTER INFO" | nc -w 5 $host $port 2>/dev/null | grep -q "cluster_state:ok"; thencluster_healthy=truebreakfidoneif [ "$cluster_healthy" = "false" ]; thenecho "❌ $(date): Redis 集群状态异常,触发业务容器重启"# 向业务容器发送 SIGTERM 信号,触发重启kill -TERM 1sleep 10elseecho "✅ $(date): Redis 集群状态正常"fisleep 30doneresources:requests:memory: "32Mi"cpu: "10m"