
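Rescuing a kubeadm-built Kubernetes cluster after years of disuse

Background:

A Kubernetes cluster built with kubeadm sat unused for years. When it was brought back up, the cluster was down and kubectl reported errors.
The high-level container runtime is Docker.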

Error message: couldn't get current server API group list: Get "https://192.168.121.141:6443/api?timeout=32s": dial tcp 192.168.121.141:6443: connect: connection refused

Resolution

1. First check that the runtime daemon is healthy
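
The screenshot for this step did not survive, so the exact command used is unknown. A minimal sketch of such a check, assuming systemd manages both services and that the units are named docker and kubelet (the helper name check_units is hypothetical):

```shell
# check_units UNIT...
# Prints "<unit>: <state>" for each systemd unit and returns non-zero
# if any of them is not "active".
check_units() {
    rc=0
    for unit in "$@"; do
        # is-active exits non-zero for inactive units, so ignore its status
        state=$(systemctl is-active "$unit" 2>/dev/null) || true
        printf '%s: %s\n' "$unit" "${state:-unknown}"
        [ "$state" = "active" ] || rc=1
    done
    return $rc
}

# Example (run on every node): check_units docker kubelet
```

If a unit is reported as anything other than active, `systemctl status <unit>` and `journalctl -u <unit>` are the usual next stops.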

2. Check whether the containers are running normally

All the containers turned out to be in a crashed state. After restarting them all in one go and checking api-server again, it still failed to run.
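
The original screenshot is gone, so the exact restart command is not known. One way the "restart everything" step could be sketched, assuming the control-plane containers carry the usual k8s_ name prefix (the same prefix the renewal script further down matches on); the helper names are hypothetical:

```shell
# List the IDs of exited containers whose name contains "k8s_".
list_exited() {
    docker ps -a -q --filter status=exited --filter name=k8s_
}

# Restart every container returned by list_exited, one at a time.
restart_exited() {
    for id in $(list_exited); do
        docker restart "$id"
    done
}

# Example: restart_exited
```

If kube-apiserver exits again right away, `docker logs <container-id>` on that container usually shows the reason, which in this case leads to the certificate check in the next step.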

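3. Check whether expired kubeadm certificates are keeping api-server from starting

[root@master manifests]# openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text |grep Not
Not Before: Apr 16 05:27:41 2023 GMT
Not After : Apr 15 05:27:41 2024 GMT

The certificate expired more than a year ago. Restore the certificates first by reissuing them with a new expiry date.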

[root@master ~]# vim update-kubeadm-cert.sh
[root@master ~]# chmod +x update-kubeadm-cert.sh
[root@master ~]# ./update-kubeadm-cert.sh all
[2025-07-10T19:09:10.178353302+0800]: INFO: backup /etc/kubernetes to /etc/kubernetes.old-20250710
Signature ok
subject=/CN=etcd-server
Getting CA Private Key
[2025-07-10T19:09:10.203823693+0800]: INFO: generated /etc/kubernetes/pki/etcd/server.crt
Signature ok
subject=/CN=etcd-peer
Getting CA Private Key
[2025-07-10T19:09:10.233991908+0800]: INFO: generated /etc/kubernetes/pki/etcd/peer.crt
Signature ok
subject=/O=system:masters/CN=kube-etcd-healthcheck-client
Getting CA Private Key
[2025-07-10T19:09:10.252579614+0800]: INFO: generated /etc/kubernetes/pki/etcd/healthcheck-client.crt
Signature ok
subject=/O=system:masters/CN=kube-apiserver-etcd-client
Getting CA Private Key
[2025-07-10T19:09:10.270662717+0800]: INFO: generated /etc/kubernetes/pki/apiserver-etcd-client.crt
[2025-07-10T19:09:10.361241647+0800]: INFO: restarted etcd
Signature ok
subject=/CN=kube-apiserver
Getting CA Private Key
[2025-07-10T19:09:10.387280029+0800]: INFO: generated /etc/kubernetes/pki/apiserver.crt
Signature ok
subject=/O=system:masters/CN=kube-apiserver-kubelet-client
Getting CA Private Key
[2025-07-10T19:09:10.404954699+0800]: INFO: generated /etc/kubernetes/pki/apiserver-kubelet-client.crt
Signature ok
subject=/CN=system:kube-controller-manager
Getting CA Private Key
[2025-07-10T19:09:10.441468594+0800]: INFO: generated /etc/kubernetes/controller-manager.crt
[2025-07-10T19:09:10.446127295+0800]: INFO: generated new /etc/kubernetes/controller-manager.conf
Signature ok
subject=/CN=system:kube-scheduler
Getting CA Private Key
[2025-07-10T19:09:10.478495262+0800]: INFO: generated /etc/kubernetes/scheduler.crt
[2025-07-10T19:09:10.483909670+0800]: INFO: generated new /etc/kubernetes/scheduler.conf
Signature ok
subject=/O=system:masters/CN=kubernetes-admin
Getting CA Private Key
[2025-07-10T19:09:10.514043835+0800]: INFO: generated /etc/kubernetes/admin.crt
[2025-07-10T19:09:10.519546384+0800]: INFO: generated new /etc/kubernetes/admin.conf
[2025-07-10T19:09:10.526208608+0800]: INFO: copy the admin.conf to ~/.kube/config for kubectl
[2025-07-10T19:09:10.528293082+0800]: WARNING: does not need to update kubelet.conf
Signature ok
subject=/CN=front-proxy-client
Getting CA Private Key
[2025-07-10T19:09:10.545998609+0800]: INFO: generated /etc/kubernetes/pki/front-proxy-client.crt
[2025-07-10T19:09:10.617143451+0800]: INFO: restarted kube-apiserver
[2025-07-10T19:09:10.676738046+0800]: INFO: restarted kube-controller-manager
[2025-07-10T19:09:10.742478810+0800]: INFO: restarted kube-scheduler
[2025-07-10T19:09:10.790700987+0800]: INFO: restarted kubelet
You have new mail in /var/spool/mail/root
[root@master ~]# openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text |grep Not
Not Before: Jul 10 11:09:10 2025 GMT
Not After : Jun 16 11:09:10 2125 GMT
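
Checking one file at a time gets tedious, so the same openssl query can be looped over every certificate. The following is a rough sketch, not part of the original procedure: it assumes GNU date is available, and the helper names days_left and report_certs are made up for illustration.

```shell
# days_left END_DATE [NOW_EPOCH]
# END_DATE is the text after "notAfter=" in `openssl x509 -noout -enddate`.
# Prints whole days until expiry (negative once expired). Requires GNU date.
days_left() {
    end_epoch=$(date -u -d "$1" +%s) || return 1
    now_epoch=${2:-$(date -u +%s)}
    echo $(( (end_epoch - now_epoch) / 86400 ))
}

# report_certs DIR
# Prints path, expiry date and remaining days for every *.crt under DIR.
report_certs() {
    find "$1" -name '*.crt' | sort | while read -r crt; do
        end=$(openssl x509 -in "$crt" -noout -enddate | cut -d= -f2)
        printf '%s\t%s\t%s days\n' "$crt" "$end" "$(days_left "$end")"
    done
}

# Example on the master: report_certs /etc/kubernetes/pki
```

kubeadm also ships its own `kubeadm certs check-expiration` subcommand, which covers the kubeadm-managed certificates without any scripting.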

The script is reproduced below. It was written by another author; see the Documentation URL in its usage text.

#!/bin/bash

set -o errexit
set -o pipefail
# set -o xtrace

log::err() {
  printf "[$(date +'%Y-%m-%dT%H:%M:%S.%N%z')]: \033[31mERROR: \033[0m$@\n"
}

log::info() {
  printf "[$(date +'%Y-%m-%dT%H:%M:%S.%N%z')]: \033[32mINFO: \033[0m$@\n"
}

log::warning() {
  printf "[$(date +'%Y-%m-%dT%H:%M:%S.%N%z')]: \033[33mWARNING: \033[0m$@\n"
}

check_file() {
  if [[ ! -r  ${1} ]]; then
    log::err "can not find ${1}"
    exit 1
  fi
}

# get x509v3 subject alternative name from the old certificate
cert::get_subject_alt_name() {
  local cert=${1}.crt
  check_file "${cert}"
  local alt_name=$(openssl x509 -text -noout -in ${cert} | grep -A1 'Alternative' | tail -n1 | sed 's/[[:space:]]*Address//g')
  printf "${alt_name}\n"
}

# get subject from the old certificate
cert::get_subj() {
  local cert=${1}.crt
  check_file "${cert}"
  local subj=$(openssl x509 -text -noout -in ${cert}  | grep "Subject:" | sed 's/Subject:/\//g;s/\,/\//;s/[[:space:]]//g')
  printf "${subj}\n"
}

cert::backup_file() {
  local file=${1}
  if [[ ! -e ${file}.old-$(date +%Y%m%d) ]]; then
    cp -rp ${file} ${file}.old-$(date +%Y%m%d)
    log::info "backup ${file} to ${file}.old-$(date +%Y%m%d)"
  else
    log::warning "does not backup, ${file}.old-$(date +%Y%m%d) already exists"
  fi
}

# generate certificate whit client, server or peer
# Args:
#   $1 (the name of certificate)
#   $2 (the type of certificate, must be one of client, server, peer)
#   $3 (the subject of certificates)
#   $4 (the validity of certificates) (days)
#   $5 (the x509v3 subject alternative name of certificate when the type of certificate is server or peer)
cert::gen_cert() {
  local cert_name=${1}
  local cert_type=${2}
  local subj=${3}
  local cert_days=${4}
  local alt_name=${5}
  local cert=${cert_name}.crt
  local key=${cert_name}.key
  local csr=${cert_name}.csr
  local csr_conf="distinguished_name = dn\n[dn]\n[v3_ext]\nkeyUsage = critical, digitalSignature, keyEncipherment\n"

  check_file "${key}"
  check_file "${cert}"

  # backup certificate when certificate not in ${kubeconf_arr[@]}
  # kubeconf_arr=("controller-manager.crt" "scheduler.crt" "admin.crt" "kubelet.crt")
  # if [[ ! "${kubeconf_arr[@]}" =~ "${cert##*/}" ]]; then
  #   cert::backup_file "${cert}"
  # fi

  case "${cert_type}" in
    client)
      openssl req -new  -key ${key} -subj "${subj}" -reqexts v3_ext \
        -config <(printf "${csr_conf} extendedKeyUsage = clientAuth\n") -out ${csr}
      openssl x509 -in ${csr} -req -CA ${CA_CERT} -CAkey ${CA_KEY} -CAcreateserial -extensions v3_ext \
        -extfile <(printf "${csr_conf} extendedKeyUsage = clientAuth\n") -days ${cert_days} -out ${cert}
      log::info "generated ${cert}"
    ;;
    server)
      openssl req -new  -key ${key} -subj "${subj}" -reqexts v3_ext \
        -config <(printf "${csr_conf} extendedKeyUsage = serverAuth\nsubjectAltName = ${alt_name}\n") -out ${csr}
      openssl x509 -in ${csr} -req -CA ${CA_CERT} -CAkey ${CA_KEY} -CAcreateserial -extensions v3_ext \
        -extfile <(printf "${csr_conf} extendedKeyUsage = serverAuth\nsubjectAltName = ${alt_name}\n") -days ${cert_days} -out ${cert}
      log::info "generated ${cert}"
    ;;
    peer)
      openssl req -new  -key ${key} -subj "${subj}" -reqexts v3_ext \
        -config <(printf "${csr_conf} extendedKeyUsage = serverAuth, clientAuth\nsubjectAltName = ${alt_name}\n") -out ${csr}
      openssl x509 -in ${csr} -req -CA ${CA_CERT} -CAkey ${CA_KEY} -CAcreateserial -extensions v3_ext \
        -extfile <(printf "${csr_conf} extendedKeyUsage = serverAuth, clientAuth\nsubjectAltName = ${alt_name}\n") -days ${cert_days} -out ${cert}
      log::info "generated ${cert}"
    ;;
    *)
      log::err "unknow, unsupported etcd certs type: ${cert_type}, supported type: client, server, peer"
      exit 1
  esac

  rm -f ${csr}
}

cert::update_kubeconf() {
  local cert_name=${1}
  local kubeconf_file=${cert_name}.conf
  local cert=${cert_name}.crt
  local key=${cert_name}.key

  # generate  certificate
  check_file ${kubeconf_file}
  # get the key from the old kubeconf
  grep "client-key-data" ${kubeconf_file} | awk {'print$2'} | base64 -d > ${key}
  # get the old certificate from the old kubeconf
  grep "client-certificate-data" ${kubeconf_file} | awk {'print$2'} | base64 -d > ${cert}
  # get subject from the old certificate
  local subj=$(cert::get_subj ${cert_name})
  cert::gen_cert "${cert_name}" "client" "${subj}" "${CAER_DAYS}"
  # get certificate base64 code
  local cert_base64=$(base64 -w 0 ${cert})

  # backup kubeconf
  # cert::backup_file "${kubeconf_file}"

  # set certificate base64 code to kubeconf
  sed -i 's/client-certificate-data:.*/client-certificate-data: '${cert_base64}'/g' ${kubeconf_file}

  log::info "generated new ${kubeconf_file}"
  rm -f ${cert}
  rm -f ${key}

  # set config for kubectl
  if [[ ${cert_name##*/} == "admin" ]]; then
    mkdir -p ~/.kube
    cp -fp ${kubeconf_file} ~/.kube/config
    log::info "copy the admin.conf to ~/.kube/config for kubectl"
  fi
}

cert::update_etcd_cert() {
  PKI_PATH=${KUBE_PATH}/pki/etcd
  CA_CERT=${PKI_PATH}/ca.crt
  CA_KEY=${PKI_PATH}/ca.key

  check_file "${CA_CERT}"
  check_file "${CA_KEY}"

  # generate etcd server certificate
  # /etc/kubernetes/pki/etcd/server
  CART_NAME=${PKI_PATH}/server
  subject_alt_name=$(cert::get_subject_alt_name ${CART_NAME})
  cert::gen_cert "${CART_NAME}" "peer" "/CN=etcd-server" "${CAER_DAYS}" "${subject_alt_name}"

  # generate etcd peer certificate
  # /etc/kubernetes/pki/etcd/peer
  CART_NAME=${PKI_PATH}/peer
  subject_alt_name=$(cert::get_subject_alt_name ${CART_NAME})
  cert::gen_cert "${CART_NAME}" "peer" "/CN=etcd-peer" "${CAER_DAYS}" "${subject_alt_name}"

  # generate etcd healthcheck-client certificate
  # /etc/kubernetes/pki/etcd/healthcheck-client
  CART_NAME=${PKI_PATH}/healthcheck-client
  cert::gen_cert "${CART_NAME}" "client" "/O=system:masters/CN=kube-etcd-healthcheck-client" "${CAER_DAYS}"

  # generate apiserver-etcd-client certificate
  # /etc/kubernetes/pki/apiserver-etcd-client
  check_file "${CA_CERT}"
  check_file "${CA_KEY}"
  PKI_PATH=${KUBE_PATH}/pki
  CART_NAME=${PKI_PATH}/apiserver-etcd-client
  cert::gen_cert "${CART_NAME}" "client" "/O=system:masters/CN=kube-apiserver-etcd-client" "${CAER_DAYS}"

  # restart etcd
  docker ps | awk '/k8s_etcd/{print$1}' | xargs -r -I '{}' docker restart {} || true
  log::info "restarted etcd"
}

cert::update_master_cert() {
  PKI_PATH=${KUBE_PATH}/pki
  CA_CERT=${PKI_PATH}/ca.crt
  CA_KEY=${PKI_PATH}/ca.key

  check_file "${CA_CERT}"
  check_file "${CA_KEY}"

  # generate apiserver server certificate
  # /etc/kubernetes/pki/apiserver
  CART_NAME=${PKI_PATH}/apiserver
  subject_alt_name=$(cert::get_subject_alt_name ${CART_NAME})
  cert::gen_cert "${CART_NAME}" "server" "/CN=kube-apiserver" "${CAER_DAYS}" "${subject_alt_name}"

  # generate apiserver-kubelet-client certificate
  # /etc/kubernetes/pki/apiserver-kubelet-client
  CART_NAME=${PKI_PATH}/apiserver-kubelet-client
  cert::gen_cert "${CART_NAME}" "client" "/O=system:masters/CN=kube-apiserver-kubelet-client" "${CAER_DAYS}"

  # generate kubeconf for controller-manager,scheduler,kubectl and kubelet
  # /etc/kubernetes/controller-manager,scheduler,admin,kubelet.conf
  cert::update_kubeconf "${KUBE_PATH}/controller-manager"
  cert::update_kubeconf "${KUBE_PATH}/scheduler"
  cert::update_kubeconf "${KUBE_PATH}/admin"
  # check kubelet.conf
  # https://github.com/kubernetes/kubeadm/issues/1753
  set +e
  grep kubelet-client-current.pem /etc/kubernetes/kubelet.conf > /dev/null 2>&1
  kubelet_cert_auto_update=$?
  set -e
  if [[ "$kubelet_cert_auto_update" == "0" ]]; then
    log::warning "does not need to update kubelet.conf"
  else
    cert::update_kubeconf "${KUBE_PATH}/kubelet"
  fi

  # generate front-proxy-client certificate
  # use front-proxy-client ca
  CA_CERT=${PKI_PATH}/front-proxy-ca.crt
  CA_KEY=${PKI_PATH}/front-proxy-ca.key
  check_file "${CA_CERT}"
  check_file "${CA_KEY}"
  CART_NAME=${PKI_PATH}/front-proxy-client
  cert::gen_cert "${CART_NAME}" "client" "/CN=front-proxy-client" "${CAER_DAYS}"

  # restart apiserve, controller-manager, scheduler and kubelet
  docker ps | awk '/k8s_kube-apiserver/{print$1}' | xargs -r -I '{}' docker restart {} || true
  log::info "restarted kube-apiserver"
  docker ps | awk '/k8s_kube-controller-manager/{print$1}' | xargs -r -I '{}' docker restart {} || true
  log::info "restarted kube-controller-manager"
  docker ps | awk '/k8s_kube-scheduler/{print$1}' | xargs -r -I '{}' docker restart {} || true
  log::info "restarted kube-scheduler"
  systemctl restart kubelet
  log::info "restarted kubelet"
}

main() {
  local node_tpye=$1

  KUBE_PATH=/etc/kubernetes
  CAER_DAYS=36500

  # backup $KUBE_PATH to $KUBE_PATH.old-$(date +%Y%m%d)
  cert::backup_file "${KUBE_PATH}"

  case ${node_tpye} in
    etcd)
      # update etcd certificates
      cert::update_etcd_cert
    ;;
    master)
      # update master certificates and kubeconf
      cert::update_master_cert
    ;;
    all)
      # update etcd certificates
      cert::update_etcd_cert
      # update master certificates and kubeconf
      cert::update_master_cert
    ;;
    *)
      log::err "unknow, unsupported certs type: ${cert_type}, supported type: all, etcd, master"
      printf "Documentation: https://github.com/yuyicai/update-kube-cert
  example:
    '\033[32m./update-kubeadm-cert.sh all\033[0m' update all etcd certificates, master certificates and kubeconf
      /etc/kubernetes
      ├── admin.conf
      ├── controller-manager.conf
      ├── scheduler.conf
      ├── kubelet.conf
      └── pki
          ├── apiserver.crt
          ├── apiserver-etcd-client.crt
          ├── apiserver-kubelet-client.crt
          ├── front-proxy-client.crt
          └── etcd
              ├── healthcheck-client.crt
              ├── peer.crt
              └── server.crt

    '\033[32m./update-kubeadm-cert.sh etcd\033[0m' update only etcd certificates
      /etc/kubernetes
      └── pki
          ├── apiserver-etcd-client.crt
          └── etcd
              ├── healthcheck-client.crt
              ├── peer.crt
              └── server.crt

    '\033[32m./update-kubeadm-cert.sh master\033[0m' update only master certificates and kubeconf
      /etc/kubernetes
      ├── admin.conf
      ├── controller-manager.conf
      ├── scheduler.conf
      ├── kubelet.conf
      └── pki
          ├── apiserver.crt
          ├── apiserver-kubelet-client.crt
          └── front-proxy-client.crt
"
      exit 1
  esac
}

main "$@"

4. Running kubectl get again reveals a new error

Error message: couldn't get current server API group list: the server has asked for the client to provide credentials

Error log:

Error message: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory

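The kubelet service is stuck in the auto-restart state, which breaks communication with the cluster. The log complains that a config file does not exist (kubelet typically looks for bootstrap-kubelet.conf only when the client certificate referenced by kubelet.conf is unusable), so check the expiry time of the kubelet certificate as well.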

[root@node1 pki]# openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -enddate
notAfter=Apr 15 05:29:49 2024 GMT

Confirmed: the expired kubelet certificate is the cause.
Replace the certificate:

1. Generate a new certificate on the master node
[root@master ~]# kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.1", GitCommit:"4c9411232e10168d7b050c49a1b59f6df9d7ea4b", GitTreeState:"clean", BuildDate:"2023-04-14T13:21:19Z", GoVersion:"go1.20.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
error: You must be logged in to the server (the server has asked for the client to provide credentials)
[root@master ~]# mkdir test
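You have new mail in /var/spool/mail/root
[root@master ~]# kubeadm init --kubernetes-version=v1.27.1 phase kubeconfig kubelet --node-name node1 --kubeconfig-dir ./test/
[kubeconfig] Writing "kubelet.conf" kubeconfig file

Do this for both the master node and the worker nodes (check each certificate's expiry first and replace every one that has expired). The generated file carries the identity of the node given in --node-name, so each node should ideally receive a file generated with its own name.
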
[root@master ~]# scp test/kubelet.conf node1:/etc/kubernetes/
The authenticity of host 'node1 (192.168.121.142)' can't be established.
ECDSA key fingerprint is SHA256:RW+UCKRCkn+EBytxKm4Y+PA7z8YXjd7EMlUifsvljhI.
ECDSA key fingerprint is MD5:6f:97:64:8b:4c:5a:b0:33:1a:4e:95:ef:e3:a1:75:61.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node1,192.168.121.142' (ECDSA) to the list of known hosts.
root@node1's password: 
kubelet.conf     
[root@master ~]# scp test/kubelet.conf /etc/kubernetes/
[root@master ~]# cd /etc/kubernetes
[root@master kubernetes]# ll
total 36
-rw-------  1 root root 5511 Jul 10 19:09 admin.conf
-rw-------  1 root root 5551 Jul 10 19:09 controller-manager.conf
-rw-------  1 root root 5595 Jul 10 19:27 kubelet.conf
drwxr-xr-x. 2 root root  113 Apr 16  2023 manifests
drwxr-xr-x  3 root root 4096 Jul 10 19:09 pki
-rw-------  1 root root 5499 Jul 10 19:09 scheduler.conf
[root@master kubernetes]# systemctl restart kubelet
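
With more than one node, the generate/copy/restart sequence above repeats per node. The sketch below only prints the commands for review instead of running them. It reuses the kubeadm invocation from the transcript and assumes that every node name is also reachable as an SSH host (as node1 is here); the ./kubelet-conf directory and the helper name kubelet_conf_cmds are hypothetical.

```shell
# kubelet_conf_cmds VERSION NODE...
# Prints, for each node, the commands that would generate a kubelet.conf
# for that node name, copy it over, and restart kubelet there.
kubelet_conf_cmds() {
    version=$1
    shift
    for node in "$@"; do
        echo "mkdir -p ./kubelet-conf/$node"
        echo "kubeadm init --kubernetes-version=$version phase kubeconfig kubelet --node-name $node --kubeconfig-dir ./kubelet-conf/$node"
        echo "scp ./kubelet-conf/$node/kubelet.conf $node:/etc/kubernetes/kubelet.conf"
        echo "ssh $node systemctl restart kubelet"
    done
}

# Example: kubelet_conf_cmds v1.27.1 master node1
```

Printing first keeps the step auditable; once the output looks right it can be piped to `sh` on the master.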

5. Listing namespace resources fails

Error message:

Error from server (Forbidden): namespaces is forbidden: User "system:node:node1" cannot list resource "namespaces" in API group "" at the cluster scope
[root@master kubernetes]# kubectl get ns -A
Error from server (Forbidden): namespaces is forbidden: User "system:node:node1" cannot list resource "namespaces" in API group "" at the cluster scope
[root@master ~]# kubectl get rolebinding,clusterrolebinding -A
Error from server (Forbidden): rolebindings.rbac.authorization.k8s.io is forbidden: User "system:node:node1" cannot list resource "rolebindings" in API group "rbac.authorization.k8s.io" at the cluster scope
Error from server (Forbidden): clusterrolebindings.rbac.authorization.k8s.io is forbidden: User "system:node:node1" cannot list resource "clusterrolebindings" in API group "rbac.authorization.k8s.io" at the cluster scope

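Fix: kubectl is authenticating as the node identity system:node:node1 rather than as the cluster administrator, so point it back at admin.conf: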

[root@master kubernetes]# mkdir -p $HOME/.kube
[root@master kubernetes]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
cp: overwrite "/root/.kube/config"? y
[root@master kubernetes]# export KUBECONFIG=/etc/kubernetes/admin.conf
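
To see why kubectl presented the wrong identity, it helps to know which kubeconfig it resolves and whose certificate that file embeds. A small sketch under stated assumptions: the helper names are hypothetical, a single-path KUBECONFIG is assumed (the variable may also hold a colon-separated list), and the kubeconfig is assumed to embed the certificate inline as client-certificate-data, as the kubeadm-generated files in this article do.

```shell
# Prints the kubeconfig path kubectl will use when no --kubeconfig flag
# is given: $KUBECONFIG if set, otherwise ~/.kube/config.
active_kubeconfig() {
    echo "${KUBECONFIG:-$HOME/.kube/config}"
}

# kubeconfig_subject FILE
# Decodes the embedded client certificate and prints its subject.
kubeconfig_subject() {
    grep 'client-certificate-data' "$1" | awk '{print $2}' | base64 -d |
        openssl x509 -noout -subject
}

# Example: kubeconfig_subject "$(active_kubeconfig)"
```

For admin.conf the subject should name kubernetes-admin in the system:masters organization (as in the renewal log above); a subject naming system:node:node1 means a kubelet kubeconfig is being used instead.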

6. Verification
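
The verification screenshot is missing, so the commands the author ran are unknown. A plausible minimal check, offered only as a sketch, is to confirm that every node reports Ready; the helper name not_ready_nodes is hypothetical, and it expects the default column layout of `kubectl get nodes --no-headers` (NAME, STATUS, ...) on standard input.

```shell
# Reads `kubectl get nodes --no-headers` output on stdin and prints the
# name of every node whose STATUS column does not start with "Ready".
not_ready_nodes() {
    awk '$2 !~ /^Ready/ {print $1}'
}

# Example: kubectl get nodes --no-headers | not_ready_nodes
```

Empty output means all nodes are Ready; `kubectl get pods -A` then shows whether the system pods have recovered too.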
