CentOS 8.3 + Kubernetes 1.32.5 + Docker 28.2.2: Binary Deployment of a High-Availability Cluster
1. Preparation
1.1 Host list
Hostname | Host IP | Docker IP | Role |
---|---|---|---|
k8s31.vm.com | 192.168.26.31 | 10.26.31.1/24 | master & worker, etcd, docker |
k8s32.vm.com | 192.168.26.32 | 10.26.32.1/24 | master & worker, etcd, docker |
k8s33.vm.com | 192.168.26.33 | 10.26.33.1/24 | master & worker, etcd, docker |
~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.26.31 k8s31.vm.com k8s31
192.168.26.32 k8s32.vm.com k8s32
192.168.26.33 k8s33.vm.com k8s33
1.2 Software list
1.2.1 Binaries
Linux ISO image: CentOS-8.3.2011-x86_64-minimal.iso (https://vault.centos.org/8.3.2011/isos/x86_64/CentOS-8.3.2011-x86_64-minimal.iso)
Download: https://mirrors.aliyun.com/centos-vault/8.3.2011/isos/x86_64/CentOS-8.3.2011-x86_64-minimal.iso
docker 28.2.2 (2025-05-30 12:58:37 77.4 MiB)
https://download.docker.com/linux/static/stable/x86_64/
Download: https://download.docker.com/linux/static/stable/x86_64/docker-28.2.2.tgz
cri-dockerd 0.3.17
https://github.com/Mirantis/cri-dockerd
Download: https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.17/cri-dockerd-0.3.17.amd64.tgz
kubernetes v1.32.5
https://github.com/kubernetes/kubernetes/releases
Download: https://dl.k8s.io/v1.32.5/kubernetes-server-linux-amd64.tar.gz
etcd v3.6.0
https://github.com/etcd-io/etcd/
Download: https://github.com/etcd-io/etcd/releases/download/v3.6.0/etcd-v3.6.0-linux-amd64.tar.gz
1.2.2 Manifests and images
Type | Name | Download Link | Notes |
---|---|---|---|
YAML manifest | coredns.yaml.base | https://github.com/kubernetes/kubernetes/blob/v1.32.5/cluster/addons/dns/coredns/coredns.yaml.base | deployed with kubectl |
YAML manifest | components.yaml (metrics server) | https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.2/components.yaml | deployed with kubectl |
Image | metrics-server:v0.7.2 | registry.aliyuncs.com/google_containers/metrics-server:v0.7.2 | for deploying Pods… |
Image | coredns:v1.11.3 | registry.aliyuncs.com/google_containers/coredns:v1.11.3 | for deploying Pods… |
Image | pause:3.9 | registry.aliyuncs.com/google_containers/pause:3.9 | for deploying Pods… |
1.3 Network planning
Network | CIDR | Notes |
---|---|---|
Node network | 192.168.26.0/24 | Node IP: the address of each Node, i.e. the NIC address of the physical (host) machine. |
Service network | 168.26.0.0/16 | Cluster IP (also called Service IP): the address assigned to a Service. The service-cluster-ip-range parameter defines this Service IP range. |
2. Container Runtime
2.1 Installing Docker
2.1.1 Extract, create symlinks, distribute
## k8s31:
~]# cd /opt/app/
app]# tar zxvf docker-28.2.2.tgz
app]# ls docker
containerd containerd-shim-runc-v2 ctr docker dockerd docker-init docker-proxy runc
app]# mv docker /opt/bin/docker-28.2.2
app]# ls /opt/bin/docker-28.2.2
containerd containerd-shim-runc-v2 ctr docker dockerd docker-init docker-proxy runc
app]# ln -s /opt/bin/docker-28.2.2/containerd /usr/bin/containerd
app]# ln -s /opt/bin/docker-28.2.2/containerd-shim-runc-v2 /usr/bin/containerd-shim-runc-v2
app]# ln -s /opt/bin/docker-28.2.2/ctr /usr/bin/ctr
app]# ln -s /opt/bin/docker-28.2.2/docker /usr/bin/docker
app]# ln -s /opt/bin/docker-28.2.2/dockerd /usr/bin/dockerd
app]# ln -s /opt/bin/docker-28.2.2/docker-init /usr/bin/docker-init
app]# ln -s /opt/bin/docker-28.2.2/docker-proxy /usr/bin/docker-proxy
app]# ln -s /opt/bin/docker-28.2.2/runc /usr/bin/runc
app]# docker -v
Docker version 28.2.2, build e6534b4
## Copy to k8s32 and k8s33
app]# scp -r /opt/bin/docker-28.2.2/ root@k8s32:/opt/bin/.
app]# scp -r /opt/bin/docker-28.2.2/ root@k8s33:/opt/bin/.
## Create the symlinks on k8s32 and k8s33 (same as above)
ln -s /opt/bin/docker-28.2.2/containerd /usr/bin/containerd
ln -s /opt/bin/docker-28.2.2/containerd-shim-runc-v2 /usr/bin/containerd-shim-runc-v2
ln -s /opt/bin/docker-28.2.2/ctr /usr/bin/ctr
ln -s /opt/bin/docker-28.2.2/docker /usr/bin/docker
ln -s /opt/bin/docker-28.2.2/dockerd /usr/bin/dockerd
ln -s /opt/bin/docker-28.2.2/docker-init /usr/bin/docker-init
ln -s /opt/bin/docker-28.2.2/docker-proxy /usr/bin/docker-proxy
ln -s /opt/bin/docker-28.2.2/runc /usr/bin/runc
2.1.2 Create directories and configuration files
- On all three hosts:
## k8s31:
~]# mkdir -p /data/docker /etc/docker
~]# cat /etc/docker/daemon.json
{"data-root": "/data/docker","storage-driver": "overlay2","insecure-registries": ["harbor.oss.com:32402"],"registry-mirrors": ["https://5gce61mx.mirror.aliyuncs.com"],"bip": "10.26.31.1/24","exec-opts": ["native.cgroupdriver=systemd"],"live-restore": true
}## k8s32:
~]# mkdir -p /data/docker /etc/docker
~]# cat /etc/docker/daemon.json
{"data-root": "/data/docker","storage-driver": "overlay2","insecure-registries": ["harbor.oss.com:32402"],"registry-mirrors": ["https://5gce61mx.mirror.aliyuncs.com"],"bip": "10.26.32.1/24","exec-opts": ["native.cgroupdriver=systemd"],"live-restore": true
}## k8s33:
~]# mkdir -p /data/docker /etc/docker
~]# cat /etc/docker/daemon.json
{"data-root": "/data/docker","storage-driver": "overlay2","insecure-registries": ["harbor.oss.com:32402"],"registry-mirrors": ["https://5gce61mx.mirror.aliyuncs.com"],"bip": "10.26.33.1/24","exec-opts": ["native.cgroupdriver=systemd"],"live-restore": true
}
2.1.3 Create the systemd unit file
]# cat /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
2.1.4 Start and check
~]# systemctl daemon-reload
~]# systemctl start docker; systemctl enable docker
~]# systemctl status docker
app]# docker info
Client:
 Version:    28.2.2
 Context:    default
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 28.2.2
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05044ec0a9a75232cad458027ca83437aae3f4da
 runc version: v1.2.6-0-ge89a299
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 4.18.0-348.7.1.el8_5.x86_64
 Operating System: CentOS Linux 8
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 3.623GiB
 Name: k8s31.vm.com
 ID: 3177934b-9db6-4161-86b2-236cc2465eef
 Docker Root Dir: /data/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  harbor.oss.com:32402
  ::1/128
  127.0.0.0/8
 Registry Mirrors:
  https://5gce61mx.mirror.aliyuncs.com/
 Live Restore Enabled: true
 Product License: Community Engine
app]# docker version
Client:
 Version:           28.2.2
 API version:       1.50
 Go version:        go1.24.3
 Git commit:        e6534b4
 Built:             Fri May 30 12:07:14 2025
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          28.2.2
  API version:      1.50 (minimum version 1.24)
  Go version:       go1.24.3
  Git commit:       45873be
  Built:            Fri May 30 11:52:20 2025
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.27
  GitCommit:        05044ec0a9a75232cad458027ca83437aae3f4da
 runc:
  Version:          1.2.6
  GitCommit:        v1.2.6-0-ge89a299
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
app]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:0c:29:5c:14:08 brd ff:ff:ff:ff:ff:ff
    inet 192.168.26.31/24 brd 192.168.26.255 scope global noprefixroute ens160
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe5c:1408/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether a2:80:b6:3a:f4:84 brd ff:ff:ff:ff:ff:ff
    inet 10.26.31.1/24 brd 10.26.31.255 scope global docker0
       valid_lft forever preferred_lft forever
2.1.5 Test
Pull a CentOS image, start a container, and check the container's IP address; it should be allocated from the configured 10.26.31.1/24 subnet.
~]# docker pull centos:centos7.9.2009
...
~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
...
centos centos7.9.2009 eeb6ee3f44bd 3 years ago 204MB
~]# docker run -i -t --name test centos:centos7.9.2009 /bin/bash
[root@1772d7516332 /]# cat /etc/hosts ## no ip a or ifconfig in this image, so check the IP in /etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00:: ip6-localnet
ff00:: ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
10.26.31.2 1772d7516332
~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1772d7516332 centos:centos7.9.2009 "/bin/bash" 2 minutes ago Exited (127) 6 seconds ago test
~]# docker rm test
~]# docker pull centos:8
...
~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
...
centos 8 5d0da3dc9764 3 years ago 231MB
[root@bf7b07f62241 /]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether f6:77:a4:66:00:b9 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.26.31.2/24 brd 10.26.31.255 scope global eth0
       valid_lft forever preferred_lft forever
[root@bf7b07f62241 /]# exit
2.2 Installing cri-dockerd
Why is the cri-dockerd plugin needed?
- When Kubernetes was first open-sourced it had no container engine of its own. Docker was by far the dominant container technology at the time, so code for talking to Docker (dockershim) was built directly into the kubelet. Before v1.24, Docker was therefore the default runtime and cri-dockerd was not needed.
- Kubernetes v1.24 removed the dockershim code, and Docker Engine does not natively implement the CRI standard, so the two can no longer integrate directly. To bridge Docker Engine to the CRI specification, Mirantis and Docker jointly created cri-dockerd, which lets Docker continue to serve as the container engine for Kubernetes.
- In short, since v1.24 Docker cannot be used directly as the Kubernetes container runtime, because the dockershim component (maintained by the Kubernetes team, not by Docker) was removed. Users are expected to switch to a CRI-compliant runtime such as containerd or CRI-O. This does not mean newer Kubernetes versions abandon Docker entirely; given Docker's huge ecosystem and user base, an installation that already has Docker can simply add cri-dockerd to satisfy the CRI requirement. In a sense, cri-dockerd is a re-implementation of dockershim.
2.2.1 Extract, create the symlink, distribute
k8s31 app]# tar -zxvf cri-dockerd-0.3.17.amd64.tgz
k8s31 app]# mv cri-dockerd /opt/bin/cri-dockerd-0.3.17
k8s31 app]# ll /opt/bin/cri-dockerd-0.3.17
total 49732
-rwxr-xr-x 1 1001 118 50921624 Apr  1 10:38 cri-dockerd
k8s31 app]# ln -s /opt/bin/cri-dockerd-0.3.17/cri-dockerd /usr/local/bin/cri-dockerd
k8s31 app]# cri-dockerd --version
cri-dockerd 0.3.17 (483e3b6)
## Copy:
k8s31 app]# scp -r /opt/bin/cri-dockerd-0.3.17 root@k8s32:/opt/bin/.
k8s31 app]# scp -r /opt/bin/cri-dockerd-0.3.17 root@k8s33:/opt/bin/.
## Create the symlink (same as above)
ln -s /opt/bin/cri-dockerd-0.3.17/cri-dockerd /usr/local/bin/cri-dockerd
2.2.2 Edit the unit file and start the service
]# cat /usr/lib/systemd/system/cri-dockerd.service
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
# Requires=cri-docker.socket ## comment this line out if the service fails to start

[Service]
Type=notify
ExecStart=/usr/local/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always

# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3

# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
]# systemctl daemon-reload
]# systemctl enable cri-dockerd && systemctl start cri-dockerd
]# systemctl status cri-dockerd
● cri-dockerd.service - CRI Interface for Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/cri-dockerd.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2024-02-14 23:04:32 CST; 15s ago
     Docs: https://docs.mirantis.com
 Main PID: 7950 (cri-dockerd)
    Tasks: 8
   Memory: 8.5M
   CGroup: /system.slice/cri-dockerd.service
           └─7950 /usr/local/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9
...
3. Installing the Kubernetes Cluster from Binary Files
kubeadm can certainly set up a Kubernetes cluster quickly, but if you need to tune the parameters of the individual Kubernetes component services, or control security settings, the high-availability mode, and so on, you need to install the cluster from binary files.
This section, based on Kubernetes v1.32.5, explains how to configure and deploy a security-enabled, highly available three-node Kubernetes cluster from binaries. A test environment can be simplified by deploying some components in single-node mode.
3.1 High-Availability Deployment Architecture for the Master
In a Kubernetes system the Master continuously communicates with every Node to keep the whole cluster healthy, while the state of all resource objects is stored in etcd. If the Master stops working, the Nodes become unmanageable and users can no longer manage the Pods running on them.
In production, therefore, the Master must be made highly available and secure access must be enabled. At a minimum, pay attention to the following:
- Deploy the Master's kube-apiserver, kube-controller-manager, and kube-scheduler services as multiple instances on at least 3 nodes; an odd number of nodes, typically 5 or 7, is recommended.
- Enable CA-based HTTPS security on the Master.
- Deploy etcd as a cluster of at least 3 nodes.
- Enable CA-based HTTPS security for the etcd cluster.
- Enable the RBAC authorization mode on the Master.
The Master's high-availability deployment architecture is shown in the figure below:
In front of the 3 Master nodes, a load balancer should provide the single entry address for clients. The load balancer can be built with either hardware or software. If you choose a software load balancer there are many open-source options; this guide uses HAProxy together with keepalived as the example. Mainstream hardware load balancers such as F5 and A10 must be purchased separately; their load-balancing rules are configured much like those of software load balancers and are not covered further here.
In this setup the three hosts have the IP addresses 192.168.26.31, 192.168.26.32, and 192.168.26.33, and the virtual IP (VIP) used by the load balancer is 192.168.26.100.
The following sections cover the high-availability deployment, key configuration, and CA certificate setup for etcd, the load balancer, the Master, and the Nodes.
3.2 Creating the CA Root Certificate
To enable CA-based security for etcd and the Kubernetes services, a CA certificate must be configured first. If a certificate can be obtained from a trusted CA, use the certificates it issues to complete the configuration. If no trusted CA is available, you can create the CA certificates yourself.
The CA certificates for etcd and Kubernetes both have to be based on a CA root certificate; this guide uses a single CA root certificate for both Kubernetes and etcd. CA certificates can be created with tools such as OpenSSL, easyrsa, or CFSSL; OpenSSL is used here.
The commands below create the CA root certificate, consisting of the private key file ca.key and the certificate file ca.crt:
cert]# openssl genrsa -out ca.key 2048
cert]# openssl req -x509 -new -nodes -key ca.key -subj "/CN=192.168.26.63" -days 36500 -out ca.crt
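A quick sanity check on the newly created root certificate (a sketch; it only inspects the files created above):
cert]# openssl x509 -in ca.crt -noout -subject -dates            ## expect CN=192.168.26.63 and a ~100-year validity
cert]# openssl rsa  -in ca.key -noout -modulus | openssl md5     ## the two digests must match,
cert]# openssl x509 -in ca.crt -noout -modulus | openssl md5     ## proving that key and certificate belong together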
3.3 Deploying a Secure etcd High-Availability Cluster
3.3.1. Download the etcd binaries, extract, link, and configure the systemd service
Extracting the etcd binary package yields the etcd and etcdctl files; first link them into the /usr/bin directory:
## k8s31:
~]# cd /opt/app/
app]# tar zxvf etcd-v3.6.0-linux-amd64.tar.gz
app]# ls etcd-v3.6.0-linux-amd64
Documentation etcd etcdctl etcdutl README-etcdctl.md README-etcdutl.md README.md READMEv2-etcdctl.md
app]# mv etcd-v3.6.0-linux-amd64 /opt/bin/etcd-v3.6.0
app]# ls /opt/bin/etcd-v3.6.0
Documentation etcd etcdctl etcdutl README-etcdctl.md README-etcdutl.md README.md READMEv2-etcdctl.md
app]# ln -s /opt/bin/etcd-v3.6.0/etcd /usr/bin/etcd
app]# ln -s /opt/bin/etcd-v3.6.0/etcdctl /usr/bin/etcdctl
app]# etcd --version
etcd Version: 3.6.0
Git SHA: f5d605a
Go Version: go1.23.9
Go OS/Arch: linux/amd64
## Copy:
k8s31 app]# scp -r /opt/bin/etcd-v3.6.0 root@k8s32:/opt/bin/.
k8s31 app]# scp -r /opt/bin/etcd-v3.6.0 root@k8s33:/opt/bin/.
## Create the symlinks (same as above)
ln -s /opt/bin/etcd-v3.6.0/etcd /usr/bin/etcd
ln -s /opt/bin/etcd-v3.6.0/etcdctl /usr/bin/etcdctl
Then deploy etcd as a systemd service by creating the unit file /usr/lib/systemd/system/etcd.service:
# /usr/lib/systemd/system/etcd.service
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target

[Service]
EnvironmentFile=/opt/cfg/etcd.conf
ExecStart=/usr/bin/etcd
Restart=always

[Install]
WantedBy=multi-user.target
Here EnvironmentFile points to the full path of the configuration file, e.g. /opt/cfg/etcd.conf, whose parameters are set in environment-variable format.
Next, configure the CA certificates that etcd needs. The complete set of parameters in the configuration file /opt/cfg/etcd.conf will be filled in after the CA certificates have been created.
3.3.2. Create the etcd CA certificates
First create an x509 v3 configuration file, etcd_ssl.cnf, whose subjectAltName parameter (alt_names) includes the IP addresses of all etcd hosts.
# /opt/cert/etcd_ssl.cnf
[ req ]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[ req_distinguished_name ]

[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names

[ alt_names ]
IP.1 = 192.168.26.31
IP.2 = 192.168.26.32
IP.3 = 192.168.26.33
Then use openssl to create the etcd server CA certificate, consisting of the etcd_server.key and etcd_server.crt files, and save them in the /opt/cert directory:
openssl genrsa -out etcd_server.key 2048
openssl req -new -key etcd_server.key -config etcd_ssl.cnf -subj "/CN=etcd-server" -out etcd_server.csr
openssl x509 -req -in etcd_server.csr -CA /opt/cert/ca.crt -CAkey /opt/cert/ca.key -CAcreateserial -days 36500 -extensions v3_req -extfile etcd_ssl.cnf -out etcd_server.crt
Finally, create the client CA certificate, consisting of the etcd_client.key and etcd_client.crt files, and also save them in the /opt/cert directory; they will later be used by kube-apiserver to connect to etcd:
openssl genrsa -out etcd_client.key 2048
openssl req -new -key etcd_client.key -config etcd_ssl.cnf -subj "/CN=etcd-client" -out etcd_client.csr
openssl x509 -req -in etcd_client.csr -CA /opt/cert/ca.crt -CAkey /opt/cert/ca.key -CAcreateserial -days 36500 -extensions v3_req -extfile etcd_ssl.cnf -out etcd_client.crt
cert]# ls -l
total 40
-rw-r--r-- 1 root root 1127 Jun  1 06:08 ca.crt
-rw------- 1 root root 1675 Jun  1 06:07 ca.key
-rw-r--r-- 1 root root   41 Jun  1 06:33 ca.srl
-rw-r--r-- 1 root root 1086 Jun  1 06:33 etcd_client.crt
-rw-r--r-- 1 root root  989 Jun  1 06:32 etcd_client.csr
-rw------- 1 root root 1675 Jun  1 06:32 etcd_client.key
-rw-r--r-- 1 root root 1086 Jun  1 06:31 etcd_server.crt
-rw-r--r-- 1 root root  989 Jun  1 06:28 etcd_server.csr
-rw------- 1 root root 1679 Jun  1 06:27 etcd_server.key
-rw-r--r-- 1 root root  311 Jun  1 06:26 etcd_ssl.cnf
## Copy:
k8s31 app]# scp -r /opt/cert/* root@k8s32:/opt/cert/.
k8s31 app]# scp -r /opt/cert/* root@k8s33:/opt/cert/.
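Before writing the etcd configuration it is worth confirming that both certificates chain back to the root CA and that the server certificate carries the expected SANs (a sketch, run in /opt/cert):
cert]# openssl verify -CAfile ca.crt etcd_server.crt etcd_client.crt
cert]# openssl x509 -in etcd_server.crt -noout -text | grep -A1 'Subject Alternative Name'   ## should list 192.168.26.31, 192.168.26.32, 192.168.26.33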
3.3.3. etcd parameter configuration
Next, configure the 3 etcd nodes. etcd can be configured via startup flags, environment variables, or a configuration file; here the settings are provided as environment variables in /opt/cfg/etcd.conf, which the systemd service reads.
The 3 etcd nodes are deployed on the hosts 192.168.26.31, 192.168.26.32, and 192.168.26.33. The contents of /opt/cfg/etcd.conf are:
]# mkdir -p /opt/etcd/data
## /opt/cfg/etcd.conf - node 1
ETCD_NAME=etcd1
ETCD_DATA_DIR=/opt/etcd/data
ETCD_CERT_FILE=/opt/cert/etcd_server.crt
ETCD_KEY_FILE=/opt/cert/etcd_server.key
ETCD_TRUSTED_CA_FILE=/opt/cert/ca.crt
ETCD_CLIENT_CERT_AUTH=true
ETCD_LISTEN_CLIENT_URLS=https://192.168.26.31:2379
ETCD_ADVERTISE_CLIENT_URLS=https://192.168.26.31:2379
ETCD_PEER_CERT_FILE=/opt/cert/etcd_server.crt
ETCD_PEER_KEY_FILE=/opt/cert/etcd_server.key
ETCD_PEER_TRUSTED_CA_FILE=/opt/cert/ca.crt
ETCD_LISTEN_PEER_URLS=https://192.168.26.31:2380
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.26.31:2380
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.26.31:2380,etcd2=https://192.168.26.32:2380,etcd3=https://192.168.26.33:2380"
ETCD_INITIAL_CLUSTER_STATE=new
## /opt/cfg/etcd.conf - node 2
ETCD_NAME=etcd2
ETCD_DATA_DIR=/opt/etcd/data
ETCD_CERT_FILE=/opt/cert/etcd_server.crt
ETCD_KEY_FILE=/opt/cert/etcd_server.key
ETCD_TRUSTED_CA_FILE=/opt/cert/ca.crt
ETCD_CLIENT_CERT_AUTH=true
ETCD_LISTEN_CLIENT_URLS=https://192.168.26.32:2379
ETCD_ADVERTISE_CLIENT_URLS=https://192.168.26.32:2379
ETCD_PEER_CERT_FILE=/opt/cert/etcd_server.crt
ETCD_PEER_KEY_FILE=/opt/cert/etcd_server.key
ETCD_PEER_TRUSTED_CA_FILE=/opt/cert/ca.crt
ETCD_LISTEN_PEER_URLS=https://192.168.26.32:2380
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.26.32:2380
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.26.31:2380,etcd2=https://192.168.26.32:2380,etcd3=https://192.168.26.33:2380"
ETCD_INITIAL_CLUSTER_STATE=new
## /opt/cfg/etcd.conf - node 3
ETCD_NAME=etcd3
ETCD_DATA_DIR=/opt/etcd/data
ETCD_CERT_FILE=/opt/cert/etcd_server.crt
ETCD_KEY_FILE=/opt/cert/etcd_server.key
ETCD_TRUSTED_CA_FILE=/opt/cert/ca.crt
ETCD_CLIENT_CERT_AUTH=true
ETCD_LISTEN_CLIENT_URLS=https://192.168.26.33:2379
ETCD_ADVERTISE_CLIENT_URLS=https://192.168.26.33:2379
ETCD_PEER_CERT_FILE=/opt/cert/etcd_server.crt
ETCD_PEER_KEY_FILE=/opt/cert/etcd_server.key
ETCD_PEER_TRUSTED_CA_FILE=/opt/cert/ca.crt
ETCD_LISTEN_PEER_URLS=https://192.168.26.33:2380
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.26.33:2380
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.26.31:2380,etcd2=https://192.168.26.32:2380,etcd3=https://192.168.26.33:2380"
ETCD_INITIAL_CLUSTER_STATE=new
3.3.4. Start the etcd cluster
Based on the systemd configuration, start the etcd service on all 3 hosts and enable it to start at boot:
## Start the etcd service and enable it at boot
]# systemctl restart etcd && systemctl enable etcd
]# systemctl status etcd
Then use the etcdctl command-line client with the client CA certificate to run the etcdctl endpoint health command against the cluster and verify that it is healthy.
## Use etcdctl endpoint health to check that the etcd cluster is healthy
]# etcdctl --cacert=/opt/cert/ca.crt --cert=/opt/cert/etcd_client.crt --key=/opt/cert/etcd_client.key --endpoints=https://192.168.26.31:2379,https://192.168.26.32:2379,https://192.168.26.33:2379 endpoint health
https://192.168.26.33:2379 is healthy: successfully committed proposal: took = 7.782435ms
https://192.168.26.31:2379 is healthy: successfully committed proposal: took = 7.786204ms
https://192.168.26.32:2379 is healthy: successfully committed proposal: took = 11.706174ms
All nodes report "healthy", so the cluster is running normally. A 3-node etcd cluster with HTTPS enabled has now been deployed.
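Beyond endpoint health, etcdctl can also show which member is currently the leader, the revision, and the DB size; a sketch using the same client certificate:
]# etcdctl --cacert=/opt/cert/ca.crt --cert=/opt/cert/etcd_client.crt --key=/opt/cert/etcd_client.key \
  --endpoints=https://192.168.26.31:2379,https://192.168.26.32:2379,https://192.168.26.33:2379 \
  endpoint status -w table
]# etcdctl --cacert=/opt/cert/ca.crt --cert=/opt/cert/etcd_client.crt --key=/opt/cert/etcd_client.key \
  --endpoints=https://192.168.26.31:2379,https://192.168.26.32:2379,https://192.168.26.33:2379 \
  member list -w table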
3.4 Deploying a Secure, Highly Available Kubernetes Master
3.4.1. Download the Kubernetes server binaries
https://github.com/kubernetes/kubernetes
First download the component binaries from the official Kubernetes GitHub repository: on the Releases page find the desired version, click the CHANGELOG link, and from there open the download page for the pre-built server binaries (Server Binaries).
The Server Binaries package contains the binaries and container image files for all Kubernetes server-side programs, provided per architecture, e.g. amd64 for x86-64 and arm64 for ARM; download the file that matches your target environment.
The services to deploy on the Kubernetes Master are etcd, kube-apiserver, kube-controller-manager, and kube-scheduler.
The services to deploy on each Node are the container runtime (e.g. docker or containerd), kubelet, and kube-proxy.
Link the Kubernetes binaries into the /usr/bin directory, then create systemd unit files for the services under /usr/lib/systemd/system to complete the installation.
## k8s31:
~]# cd /opt/app/
app]# tar zxvf kubernetes-server-linux-amd64.tar.gz
...
app]# mkdir /opt/bin/kubernetes-1.32.5
app]# mv kubernetes/server/bin/{kube-apiserver,kube-controller-manager,kube-scheduler,kubelet,kube-proxy,kubectl} /opt/bin/kubernetes-1.32.5/.
[root@k8s31 app]# ll /opt/bin/kubernetes-1.32.5/.
total 436240
-rwxr-xr-x 1 root root 93266072 May 15 05:23 kube-apiserver
-rwxr-xr-x 1 root root 85991576 May 15 05:23 kube-controller-manager
-rwxr-xr-x 1 root root 57327768 May 15 05:23 kubectl
-rwxr-xr-x 1 root root 77410564 May 15 05:23 kubelet
-rwxr-xr-x 1 root root 66842776 May 15 05:23 kube-proxy
-rwxr-xr-x 1 root root 65847448 May 15 05:23 kube-scheduler
app]# ln -s /opt/bin/kubernetes-1.32.5/kube-apiserver /usr/bin/kube-apiserver
app]# ln -s /opt/bin/kubernetes-1.32.5/kube-controller-manager /usr/bin/kube-controller-manager
app]# ln -s /opt/bin/kubernetes-1.32.5/kubectl /usr/bin/kubectl
app]# ln -s /opt/bin/kubernetes-1.32.5/kubelet /usr/bin/kubelet
app]# ln -s /opt/bin/kubernetes-1.32.5/kube-proxy /usr/bin/kube-proxy
app]# ln -s /opt/bin/kubernetes-1.32.5/kube-scheduler /usr/bin/kube-scheduler
app]# kubectl version
Client Version: v1.32.5
Kustomize Version: v5.5.0
The connection to the server localhost:8080 was refused - did you specify the right host or port?
## Copy:
k8s31 app]# scp -r /opt/bin/kubernetes-1.32.5 root@k8s32:/opt/bin/.
k8s31 app]# scp -r /opt/bin/kubernetes-1.32.5 root@k8s33:/opt/bin/.
## Create the symlinks (same as above)
ln -s /opt/bin/kubernetes-1.32.5/kube-apiserver /usr/bin/kube-apiserver
ln -s /opt/bin/kubernetes-1.32.5/kube-controller-manager /usr/bin/kube-controller-manager
ln -s /opt/bin/kubernetes-1.32.5/kubectl /usr/bin/kubectl
ln -s /opt/bin/kubernetes-1.32.5/kubelet /usr/bin/kubelet
ln -s /opt/bin/kubernetes-1.32.5/kube-proxy /usr/bin/kube-proxy
ln -s /opt/bin/kubernetes-1.32.5/kube-scheduler /usr/bin/kube-scheduler
3.4.2. Deploying the kube-apiserver service
(1) Set up the CA certificates needed by the kube-apiserver service. First prepare an x509 v3 certificate configuration file (master_ssl.cnf):
# CA certificate configuration
# /opt/cert/master_ssl.cnf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]

[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names

[alt_names]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
DNS.5 = k8s-1
DNS.6 = k8s-2
DNS.7 = k8s-3
IP.1 = 168.26.0.1
IP.2 = 192.168.26.31
IP.3 = 192.168.26.32
IP.4 = 192.168.26.33
IP.5 = 192.168.26.100
In the subjectAltName field ([alt_names]) of this file, list all domain names and IP addresses of the Master service:
- the DNS host names, e.g. k8s-1, k8s-2, k8s-3;
- the virtual service names of the Master Service, e.g. kubernetes.default;
- the IP addresses, including the address of each host running kube-apiserver and the address of the load balancer, e.g. 192.168.26.31, 192.168.26.32, 192.168.26.33, and 192.168.26.100;
- the ClusterIP address of the Master Service, e.g. 168.26.0.1.
Then use openssl to create the kube-apiserver server CA certificate, consisting of the apiserver.key and apiserver.crt files, and save them in the /opt/cert directory:
# Create the server CA certificate
openssl genrsa -out apiserver.key 2048
openssl req -new -key apiserver.key -config master_ssl.cnf -subj "/CN=apiserver" -out apiserver.csr
openssl x509 -req -in apiserver.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 36500 -extensions v3_req -extfile master_ssl.cnf -out apiserver.crt
## Copy
scp -r /opt/cert/apiserver* root@k8s32:/opt/cert/.
scp -r /opt/cert/apiserver* root@k8s33:/opt/cert/.
(2) Create the systemd unit file /usr/lib/systemd/system/kube-apiserver.service for the kube-apiserver service. Its EnvironmentFile parameter points to the environment file (/opt/cfg/apiserver in the unit below), whose KUBE_API_ARGS variable holds the kube-apiserver startup parameters:
## Configuration of the Kubernetes services
## /usr/lib/systemd/system/kube-apiserver.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/opt/cfg/apiserver
ExecStart=/usr/bin/kube-apiserver $KUBE_API_ARGS
Restart=always

[Install]
WantedBy=multi-user.target
(3) In the environment file /opt/cfg/apiserver, set the KUBE_API_ARGS variable to the full list of kube-apiserver startup parameters:
## /opt/cfg/apiserver
KUBE_API_ARGS="--secure-port=6443 \
--tls-cert-file=/opt/cert/apiserver.crt \
--tls-private-key-file=/opt/cert/apiserver.key \
--client-ca-file=/opt/cert/ca.crt \
--apiserver-count=3 --endpoint-reconciler-type=master-count \
--etcd-servers=https://192.168.26.31:2379,https://192.168.26.32:2379,https://192.168.26.33:2379 \
--etcd-cafile=/opt/cert/ca.crt \
--etcd-certfile=/opt/cert/etcd_client.crt \
--etcd-keyfile=/opt/cert/etcd_client.key \
--service-cluster-ip-range=168.26.0.0/16 \
--service-node-port-range=30000-32767 \
--service-account-issuer=https://kubernetes.default.svc.cluster.local \
--service-account-signing-key-file=/opt/cert/apiserver.key \
--service-account-key-file=/opt/cert/apiserver.key \
--allow-privileged=true"
- --secure-port: the HTTPS port, 6443 by default.
- --tls-cert-file: full path of the server CA certificate file, e.g. /opt/cert/apiserver.crt.
- --tls-private-key-file: full path of the server private key file, e.g. /opt/cert/apiserver.key.
- --client-ca-file: full path of the CA root certificate, e.g. /opt/cert/ca.crt.
- --apiserver-count: the number of API Server instances, e.g. 3; requires --endpoint-reconciler-type=master-count to be set as well.
- --etcd-servers: the list of etcd URLs to connect to, here over HTTPS, e.g. https://192.168.26.31:2379, https://192.168.26.32:2379, and https://192.168.26.33:2379.
- --etcd-cafile: full path of the CA root certificate used by etcd, e.g. /opt/cert/ca.crt.
- --etcd-certfile: full path of the etcd client certificate, e.g. /opt/cert/etcd_client.crt.
- --etcd-keyfile: full path of the etcd client private key, e.g. /opt/cert/etcd_client.key.
- --service-cluster-ip-range: the Service virtual IP range in CIDR notation, e.g. 168.26.0.0/16; it must not contain any physical host IP addresses.
- --service-node-port-range: the range of host ports usable by Services, 30000-32767 by default.
- --allow-privileged: whether containers may run in privileged mode; defaults to true.
(4) Once the configuration files are ready, start the kube-apiserver service on all 3 hosts and enable it at boot:
systemctl start kube-apiserver && systemctl enable kube-apiserver
systemctl status kube-apiserver
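Before the load balancer is in place, each instance can be probed directly to confirm it is serving TLS on port 6443 (a sketch; depending on the anonymous-auth and authorization settings the response may be the version JSON or a 401/403, either of which shows the service is up):
~]# curl -k https://192.168.26.31:6443/version
~]# curl -k https://192.168.26.32:6443/version
~]# curl -k https://192.168.26.33:6443/version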
3.4.3. Creating the client CA certificate
kube-controller-manager, kube-scheduler, kubelet, and kube-proxy connect to the kube-apiserver service as clients, so a client CA certificate must be created for them so that they can access kube-apiserver correctly. A single certificate is created for all of these services.
Create the CA certificate and private key files with openssl:
## Create the client CA certificate
openssl genrsa -out client.key 2048
openssl req -new -key client.key -subj "/CN=admin" -out client.csr
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt -days 36500
Here the "/CN" name in the -subj parameter can be set to "admin"; it identifies the user name of the client connecting to kube-apiserver.
Save the generated client.key and client.crt files in the /opt/cert directory.
## Copy
scp -r /opt/cert/client* root@k8s32:/opt/cert/.
scp -r /opt/cert/client* root@k8s33:/opt/cert/.
3.4.4. Creating the kubeconfig file clients use to connect to kube-apiserver
Create a single kubeconfig file for kube-controller-manager, kube-scheduler, kubelet, and kube-proxy to use as their configuration for connecting to the kube-apiserver service; it will later also serve as the configuration file for kubectl.
The kubeconfig file mainly sets the kube-apiserver URL and the required CA certificate parameters:
## /opt/cfg/kubeconfig
apiVersion: v1
kind: Config
clusters:
- name: default
  cluster:
    server: https://192.168.26.100:9443
    certificate-authority: /opt/cert/ca.crt
users:
- name: admin
  user:
    client-certificate: /opt/cert/client.crt
    client-key: /opt/cert/client.key
contexts:
- context:
    cluster: default
    user: admin
  name: default
current-context: default
- server URL: set to the virtual IP address used by the load balancer (HAProxy), 192.168.26.100, and the port HAProxy listens on (9443).
- client-certificate: full path of the client certificate file (client.crt).
- client-key: full path of the client private key file (client.key).
- certificate-authority: full path of the CA root certificate (ca.crt).
- user name under users and user under context: the user name for connecting to the API Server; set it to match the "/CN" name in the client certificate ("admin").
Save the kubeconfig file in the /opt/cfg directory.
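Once the HAProxy/keepalived load balancer from section 3.4.7 is running, the file can be exercised directly with kubectl (a sketch):
]# kubectl --kubeconfig=/opt/cfg/kubeconfig cluster-info
]# kubectl --kubeconfig=/opt/cfg/kubeconfig get --raw /healthz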
3.4.5. Deploying the kube-controller-manager service
(1) Create the systemd unit file /usr/lib/systemd/system/kube-controller-manager.service for the kube-controller-manager service. Its EnvironmentFile parameter points to the environment file (/opt/cfg/controller-manager in the unit below), whose KUBE_CONTROLLER_MANAGER_ARGS variable holds the kube-controller-manager startup parameters:
## /usr/lib/systemd/system/kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/opt/cfg/controller-manager
ExecStart=/usr/bin/kube-controller-manager $KUBE_CONTROLLER_MANAGER_ARGS
Restart=always

[Install]
WantedBy=multi-user.target
(2) In the environment file /opt/cfg/controller-manager, set the KUBE_CONTROLLER_MANAGER_ARGS variable to the full list of kube-controller-manager startup parameters:
## /opt/cfg/controller-manager
KUBE_CONTROLLER_MANAGER_ARGS="--kubeconfig=/opt/cfg/kubeconfig \
--leader-elect=true \
--service-cluster-ip-range=168.26.0.0/16 \
--service-account-private-key-file=/opt/cert/apiserver.key \
--root-ca-file=/opt/cert/ca.crt"
- --kubeconfig: the configuration for connecting to the API Server.
- --leader-elect: enable leader election; set to "true" in a 3-node environment.
- --service-account-private-key-file: full path of the private key used to sign ServiceAccount tokens, here /opt/cert/apiserver.key.
- --root-ca-file: full path of the CA root certificate, here /opt/cert/ca.crt.
- --service-cluster-ip-range: the Service virtual IP range in CIDR notation, e.g. 168.26.0.0/16; it must match the setting in the kube-apiserver service.
(3) Once the configuration files are ready, start the kube-controller-manager service on all 3 hosts and enable it at boot:
systemctl daemon-reload
systemctl start kube-controller-manager && systemctl enable kube-controller-manager
systemctl status kube-controller-manager
3.4.6. Deploying the kube-scheduler service
(1) Create the systemd unit file /usr/lib/systemd/system/kube-scheduler.service for the kube-scheduler service. Its EnvironmentFile parameter points to the environment file (/opt/cfg/scheduler in the unit below), whose KUBE_SCHEDULER_ARGS variable holds the kube-scheduler startup parameters:
## /usr/lib/systemd/system/kube-scheduler.service
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/opt/cfg/scheduler
ExecStart=/usr/bin/kube-scheduler $KUBE_SCHEDULER_ARGS
Restart=always

[Install]
WantedBy=multi-user.target
(2) In the environment file /opt/cfg/scheduler, set the KUBE_SCHEDULER_ARGS variable to the full list of kube-scheduler startup parameters:
## /opt/cfg/scheduler
KUBE_SCHEDULER_ARGS="--kubeconfig=/opt/cfg/kubeconfig \
--leader-elect=true"
- --kubeconfig: the configuration for connecting to the API Server.
- --leader-elect: enable leader election; set to "true" in a 3-node environment.
(3) Once the configuration files are ready, start the kube-scheduler service on all 3 hosts and enable it at boot:
systemctl start kube-scheduler && systemctl enable kube-scheduler
systemctl status kube-scheduler
Verify the state of each service with systemctl status; if it is "running" and there are no errors in the logs, the service started successfully.
~]# kubectl --kubeconfig=/opt/cfg/kubeconfig get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy ok
3.4.7. Deploying a highly available load balancer with HAProxy and keepalived
Next, deploy HAProxy and keepalived in front of the three kube-apiserver services, using the VIP 192.168.26.100 as the single entry address to the Master for clients.
Deploy HAProxy and keepalived with at least two instances each to avoid a single point of failure. Below they are deployed on the two servers 192.168.26.31 and 192.168.26.32. HAProxy forwards client requests to the three backend kube-apiserver instances, and keepalived keeps the virtual IP address 192.168.26.100 highly available. The deployment architecture of HAProxy and keepalived is shown in the figure below:
Now deploy the HAProxy and keepalived instances.
1) Deploy two HAProxy instances
Prepare the HAProxy configuration file haproxy.cfg:
## /opt/cfg/haproxy.cfg
global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4096
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option                  http-server-close
    option                  forwardfor except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

frontend kube-apiserver
    mode                    tcp
    bind                    *:9443
    option                  tcplog
    default_backend         kube-apiserver

listen stats
    mode                    http
    bind                    *:8888
    stats auth              admin:password
    stats refresh           5s
    stats realm             HAProxy\ Statistics
    stats uri               /stats
    log                     127.0.0.1 local3 err

backend kube-apiserver
    mode        tcp
    balance     roundrobin
    server k8s-master1 192.168.26.31:6443 check
    server k8s-master2 192.168.26.32:6443 check
    server k8s-master3 192.168.26.33:6443 check
- frontend: the protocol and port HAProxy listens on — TCP on port 9443.
- backend: the addresses of the three backend kube-apiservers in IP:Port form, i.e. 192.168.26.31:6443, 192.168.26.32:6443, and 192.168.26.33:6443; the mode field sets the protocol, here "tcp"; the balance field sets the load-balancing policy, e.g. roundrobin for round-robin.
- listen stats: the status-monitoring service; bind sets the listening port to 8888, stats auth sets the access account and password (admin:password here), and stats uri sets the URL path, e.g. /stats. (The file can be syntax-checked before starting the container, as shown below.)
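A quick syntax check of the configuration file with HAProxy itself (a sketch that reuses the image pulled below; haproxy -c only validates the file and exits):
~]# docker run --rm -v /opt/cfg/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro \
  haproxytech/haproxy-debian:2.3 haproxy -c -f /usr/local/etc/haproxy/haproxy.cfg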
Run HAProxy in a Docker container using the haproxytech/haproxy-debian image.
On the two servers 192.168.26.31 and 192.168.26.32, start HAProxy with the configuration file haproxy.cfg mounted into the container's /usr/local/etc/haproxy directory:
## Pull the haproxytech/haproxy-debian:2.3 image
~]# docker pull haproxytech/haproxy-debian:2.3
## Start haproxy
~]# docker run -d --name k8s-haproxy \
  --net=host \
  --restart=always \
  -v /opt/cfg/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro \
  haproxytech/haproxy-debian:2.3
~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d15f938306ee haproxytech/haproxy-debian:2.3 "/docker-entrypoint.…" 18 seconds ago Up 17 seconds k8s-haproxy
If everything is working, open http://192.168.26.31:8888/stats in a browser to reach the HAProxy management page; after logging in with the user name and password (admin:password) the main page looks like the figure below:
The last table is the one of interest: it lists the three kube-apiserver addresses configured in the backend section of haproxy.cfg, all with status "UP", meaning connections to the three kube-apiserver services were established successfully and HAProxy is working properly.
2) Deploy two keepalived instances
keepalived maintains the high availability of the virtual IP address and is likewise deployed on 192.168.26.31 and 192.168.26.32. The key part of the configuration is monitoring the health of HAProxy: when one HAProxy instance becomes unavailable, the virtual IP address is automatically switched to the other host.
On the first server, 192.168.26.31, create the configuration file keepalived.conf:
## /opt/cfg/keepalived.conf - master 1
! Configuration File for keepalived

global_defs {
    router_id LVS_1
}

vrrp_script checkhaproxy {
    script "/opt/bin/check-haproxy.sh"
    interval 2
    weight -30
}

vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 51
    priority 100
    advert_int 1

    virtual_ipaddress {
        192.168.26.100/24 dev ens160
    }

    authentication {
        auth_type PASS
        auth_pass password
    }

    track_script {
        checkhaproxy
    }
}
The main parameters are set in the vrrp_instance block:
- vrrp_instance VI_1: the name of the keepalived virtual router group (VRRP instance).
- state: set to "MASTER"; all other keepalived instances are set to "BACKUP".
- interface: the network interface on which the virtual IP address will be configured.
- virtual_router_id: e.g. 51.
- priority: the priority, e.g. 100.
- virtual_ipaddress: the virtual IP address, e.g. 192.168.26.100/24.
- authentication: the credentials for accessing the keepalived service.
- track_script: the HAProxy health-check script.
keepalived must continuously monitor the state of HAProxy and switch to a healthy HAProxy instance when the local one stops working properly. This requires an HAProxy health-check script that keepalived runs periodically, for example the script check-haproxy.sh below, saved under /opt/bin.
## /opt/bin/check-haproxy.sh
#!/bin/bash
count=`netstat -apn | grep 9443 | wc -l`
if [ $count -gt 0 ]; then
    exit 0
else
    exit 1
fi
~]# chmod +x /opt/bin/check-haproxy.sh
~]# ls -l /opt/bin/check-haproxy.sh
-rwxr-xr-x 1 root root 111 Jun  1 23:24 /opt/bin/check-haproxy.sh
The script returns 0 on success and a non-zero value on failure. With the configuration above, keepalived checks the state of HAProxy every 2 seconds. If, for example, HAProxy on the first host 192.168.26.31 is found to be unhealthy, keepalived moves the virtual IP address to the second host 192.168.26.32, where HAProxy is running normally, keeping the VIP 192.168.26.100 highly available.
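The script can also be exercised by hand to confirm it returns the expected exit codes (a quick sketch; try it once while HAProxy is listening on 9443 and once while it is stopped):
~]# /opt/bin/check-haproxy.sh; echo $?    ## expect 0 while something listens on 9443, 1 otherwise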
On the second host, 192.168.26.32, create the configuration file keepalived.conf:
## /opt/cfg/keepalived.conf - master 2
! Configuration File for keepalived

global_defs {
    router_id LVS_2
}

vrrp_script checkhaproxy {
    script "/opt/bin/check-haproxy.sh"
    interval 2
    weight -30
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 51
    priority 100
    advert_int 1

    virtual_ipaddress {
        192.168.26.100/24 dev ens160
    }

    authentication {
        auth_type PASS
        auth_pass password
    }

    track_script {
        checkhaproxy
    }
}
The main differences from the first keepalived configuration are:
- state in vrrp_instance is set to "BACKUP", because only one instance in the whole keepalived cluster may be "MASTER". If the keepalived cluster has more than two instances, all of them except the MASTER should be set to "BACKUP".
- The vrrp_instance name "VI_1" must match the MASTER's configuration, indicating that they belong to the same virtual router group; when the MASTER becomes unavailable, the BACKUP instances in the same group automatically elect a new MASTER.
The HAProxy health-check script check-haproxy.sh is the same as for the first keepalived instance.
Run keepalived in a Docker container using the osixia/keepalived image.
On the two hosts 192.168.26.31 and 192.168.26.32, start keepalived with the configuration file keepalived.conf mounted into the container's /container/service/keepalived/assets directory:
~]# docker pull osixia/keepalived:2.0.20
## Start keepalived
~]# docker run -d --name k8s-keepalived \
  --restart=always \
  --net=host \
  --cap-add=NET_ADMIN --cap-add=NET_BROADCAST --cap-add=NET_RAW \
  -v /opt/cfg/keepalived.conf:/container/service/keepalived/assets/keepalived.conf \
  -v /opt/bin/check-haproxy.sh:/usr/bin/check-haproxy.sh \
  osixia/keepalived:2.0.20 --copy-service
~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a8525d4d0bb3 osixia/keepalived:2.0.20 "/container/tool/run…" 7 seconds ago Up 6 seconds k8s-keepalived
1c2bec34b643 haproxytech/haproxy-debian:2.3 "/docker-entrypoint.…" 19 minutes ago Up 19 minutes k8s-haproxy
When everything is running normally, keepalived configures the virtual IP address 192.168.26.100 on the ens160 interface of the first host, 192.168.26.31. The HAProxy instance on that host then listens on port 9443 of this address and provides the load balancer entry point, 192.168.26.100:9443, for clients that need to access the Kubernetes Master.
Check the IP addresses of host 192.168.26.31 with the ip addr command; the virtual IP address 192.168.26.100 should appear as an additional address on the ens160 interface (see the check below):
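A sketch of the check (the VIP may sit on either host, depending on which keepalived currently holds the MASTER role):
## On 192.168.26.31 (the initial MASTER) the VIP should be listed on ens160
~]# ip addr show ens160 | grep 192.168.26.100
## Optional failover test: stop HAProxy here, the check script fails, and the VIP should move to 192.168.26.32
~]# docker stop k8s-haproxy
~]# docker start k8s-haproxy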
Then use curl to verify that the kube-apiserver service can be reached through HAProxy at 192.168.26.100:9443:
## Access kube-apiserver through HAProxy
# curl -v -k https://192.168.26.100:9443
* Rebuilt URL to: https://192.168.26.100:9443/
* Trying 192.168.26.100...
* TCP_NODELAY set
* Connected to 192.168.26.100 (192.168.26.100) port 9443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, [no content] (0):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, [no content] (0):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=apiserver
* start date: Jun 1 16:05:16 2025 GMT
* expire date: May 8 16:05:16 2125 GMT
* issuer: CN=192.168.26.63
* SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* Using Stream ID: 1 (easy handle 0x55bbff2a3690)
* TLSv1.3 (OUT), TLS app data, [no content] (0):
> GET / HTTP/2
> Host: 192.168.26.100:9443
> User-Agent: curl/7.61.1
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS app data, [no content] (0):
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (IN), TLS app data, [no content] (0):
* TLSv1.3 (IN), TLS app data, [no content] (0):
* TLSv1.3 (IN), TLS app data, [no content] (0):
< HTTP/2 401
< audit-id: 292539f8-76f4-4072-b56e-55a2aed21963
< cache-control: no-cache, private
< content-type: application/json
< content-length: 157
< date: Mon, 02 Jun 2025 08:32:51 GMT
<
* TLSv1.3 (IN), TLS app data, [no content] (0):
{"kind": "Status","apiVersion": "v1","metadata": {},"status": "Failure","message": "Unauthorized","reason": "Unauthorized","code": 401
* Closing connection 0
* TLSv1.3 (OUT), TLS alert, [no content] (0):
* TLSv1.3 (OUT), TLS alert, close notify (256):
}
The TCP/IP connection is established successfully, which shows that the backend kube-apiserver services can be reached through the virtual IP address 192.168.26.100. All three services needed on the Master are now running. Next, deploy the services on each Node.
3.4.8. Configuring bootstrapping
- Create bootstrap-kubelet.kubeconfig with -server=https://192.168.26.100:9443. This only needs to be run once, on a single node.
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/cert/ca.crt \
  --embed-certs=true --server=https://192.168.26.100:9443 \
  --kubeconfig=/opt/cert/bootstrap-kubelet.kubeconfig
kubectl config set-credentials tls-bootstrap-token-user \
  --token=bc5692.ebcfbe81d917383c \
  --kubeconfig=/opt/cert/bootstrap-kubelet.kubeconfig
kubectl config set-context tls-bootstrap-token-user@kubernetes \
  --cluster=kubernetes \
  --user=tls-bootstrap-token-user \
  --kubeconfig=/opt/cert/bootstrap-kubelet.kubeconfig
kubectl config use-context tls-bootstrap-token-user@kubernetes \
  --kubeconfig=/opt/cert/bootstrap-kubelet.kubeconfig
## Copy /opt/cert/bootstrap-kubelet.kubeconfig to the other nodes
]# scp /opt/cert/bootstrap-kubelet.kubeconfig root@k8s32:/opt/cert/bootstrap-kubelet.kubeconfig
]# scp /opt/cert/bootstrap-kubelet.kubeconfig root@k8s33:/opt/cert/bootstrap-kubelet.kubeconfig
The token is defined in bootstrap.secret.yaml (see the attached file bootstrap.secret.yaml); change it there if needed.
## Generate a token (you can also define one yourself)
~]# head -c 16 /dev/urandom | od -An -t x | tr -d ' '
bc5692ebcfbe81d917383c89e60d4388
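The bootstrap token used earlier (bc5692.ebcfbe81d917383c) is derived from this random string: the first 6 characters become the token-id and the next 16 the token-secret, joined by a dot. A small shell sketch of the derivation:
~]# TOKEN=bc5692ebcfbe81d917383c89e60d4388
~]# echo "${TOKEN:0:6}.${TOKEN:6:16}"
bc5692.ebcfbe81d917383c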
- bootstrap.secret.yaml
## Changes:
apiVersion: v1
kind: Secret
metadata:
  name: bootstrap-token-bc5692   ## change to match the first 6 characters of the token
  namespace: kube-system
type: bootstrap.kubernetes.io/token
stringData:
  description: "The default bootstrap token generated by 'kubelet '."
  token-id: bc5692               ## change to match the first 6 characters of the token
  token-secret: ebcfbe81d917383c ## change to match characters 7-22 of the token (16 characters)
...
app]# kubectl create -f bootstrap.secret.yaml --kubeconfig=/opt/cfg/kubeconfig
secret/bootstrap-token-bc5692 created
clusterrolebinding.rbac.authorization.k8s.io/kubelet-bootstrap created
clusterrolebinding.rbac.authorization.k8s.io/node-autoapprove-bootstrap created
clusterrolebinding.rbac.authorization.k8s.io/node-autoapprove-certificate-rotation created
clusterrole.rbac.authorization.k8s.io/system:kube-apiserver-to-kubelet created
clusterrolebinding.rbac.authorization.k8s.io/system:kube-apiserver created
3.5 Deploying the Services on Each Node
Each Node needs the container runtime (e.g. containerd), kubelet, and kube-proxy system components. Choose a container runtime that suits your needs, such as the open-source containerd or CRI-O; see their documentation for installation details.
The three hosts 192.168.26.31, 192.168.26.32, and 192.168.26.33 are all used as Nodes, giving a Kubernetes cluster with 3 Nodes.
3.5.1. Deploying the kubelet service (with Docker as the container runtime)
- Create the working directory
~]# mkdir /data/kubernetes/kubelet -p
- Unit file /usr/lib/systemd/system/kubelet.service. Docker is used as the runtime by default.
cat > /usr/lib/systemd/system/kubelet.service << EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=cri-dockerd.service
Requires=cri-dockerd.service

[Service]
WorkingDirectory=/data/kubernetes/kubelet
ExecStart=/usr/bin/kubelet \\
  --bootstrap-kubeconfig=/opt/cert/bootstrap-kubelet.kubeconfig \\
  --cert-dir=/opt/cert \\
  --kubeconfig=/opt/cfg/kubeconfig \\
  --config=/opt/cfg/kubelet.json \\
  --container-runtime-endpoint=unix:///var/run/cri-dockerd.sock \\
  --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9 \\
  --root-dir=/data/kubernetes/kubelet \\
  --v=2
Restart=on-failure
RestartSec=5[Install]
WantedBy=multi-user.target
EOF
The file /opt/cert/kubelet.kubeconfig is created automatically; delete it first if it already exists.
- Create the kubelet configuration file /opt/cfg/kubelet.json on all k8s nodes
cat > /opt/cfg/kubelet.json << EOF
{"kind": "KubeletConfiguration","apiVersion": "kubelet.config.k8s.io/v1beta1","authentication": {"x509": {"clientCAFile": "/opt/cert/ca.crt"},"webhook": {"enabled": true,"cacheTTL": "2m0s"},"anonymous": {"enabled": false}},"authorization": {"mode": "Webhook","webhook": {"cacheAuthorizedTTL": "5m0s","cacheUnauthorizedTTL": "30s"}},"address": "192.168.26.31","port": 10250,"readOnlyPort": 10255,"cgroupDriver": "systemd", "hairpinMode": "promiscuous-bridge","serializeImagePulls": false,"clusterDomain": "cluster.local.","clusterDNS": ["168.26.0.2"]
}
EOF
Note: adjust "address" per node: "192.168.26.31", "192.168.26.32", or "192.168.26.33".
- Start
~]# systemctl daemon-reload
~]# systemctl enable --now kubelet
~]# systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2025-06-02 05:55:58 EDT; 56s ago
     Docs: https://github.com/kubernetes/kubernetes
 Main PID: 4746 (kubelet)
    Tasks: 10 (limit: 23520)
   Memory: 33.1M
......
## If the service fails to start, check the error messages with: journalctl -fu kubelet
~]# kubectl get nodes -o wide --kubeconfig=/opt/cfg/kubeconfig
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s31.vm.com NotReady <none> 5m33s v1.32.5 192.168.26.31 <none> CentOS Linux 8 4.18.0-348.7.1.el8_5.x86_64 docker://28.2.2
k8s32.vm.com NotReady <none> 5m33s v1.32.5 192.168.26.32 <none> CentOS Linux 8 4.18.0-348.7.1.el8_5.x86_64 docker://28.2.2
k8s33.vm.com NotReady <none> 65s v1.32.5 192.168.26.33 <none> CentOS Linux 8 4.18.0-348.7.1.el8_5.x86_64 docker://28.2.2
## Check whether there are kubelet certificate signing requests
~]# kubectl get csr --kubeconfig=/opt/cfg/kubeconfig
...
## If a request is in the Pending state, approve it
~]# kubectl --kubeconfig=/opt/cfg/kubeconfig certificate approve node-csr-......
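If several nodes register at once, the pending requests can also be approved in one pass (a sketch; review the list before bulk-approving):
~]# kubectl --kubeconfig=/opt/cfg/kubeconfig get csr -o name | \
  xargs kubectl --kubeconfig=/opt/cfg/kubeconfig certificate approve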
If the nodes are still NotReady, a CNI network plugin needs to be installed (see the core add-on deployment below).
~]# kubectl --kubeconfig=/opt/cfg/kubeconfig get nodes
NAME STATUS ROLES AGE VERSION
k8s31.vm.com NotReady <none> 7m56s v1.32.5
k8s32.vm.com NotReady <none> 7m56s v1.32.5
k8s33.vm.com NotReady <none> 3m28s v1.32.5
3.5.2. Deploying the kube-proxy service
(1) Create the systemd unit file /usr/lib/systemd/system/kube-proxy.service for the kube-proxy service:
## /usr/lib/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target

[Service]
EnvironmentFile=/opt/cfg/proxy
ExecStart=/usr/bin/kube-proxy $KUBE_PROXY_ARGS
Restart=always

[Install]
WantedBy=multi-user.target
(2) The configuration file /opt/cfg/proxy sets the environment variable KUBE_PROXY_ARGS to the full list of kube-proxy startup parameters:
## /opt/cfg/proxy
KUBE_PROXY_ARGS="--kubeconfig=/opt/cfg/kubeconfig \
--hostname-override=192.168.26.31 \
--proxy-mode=iptables"
- --kubeconfig: the client identity for connecting to the API Server; the same kubeconfig file as used by kubelet can be used.
- --hostname-override: the name of this Node in the cluster; defaults to the host name.
- --proxy-mode: the proxy mode; options include iptables, ipvs, and kernelspace (for Windows Nodes).
(3) Once the configuration file is ready, start the kube-proxy service on each Node and enable it at boot:
~]# systemctl start kube-proxy && systemctl enable kube-proxy
~]# systemctl status kube-proxy
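With kube-proxy running in iptables mode, the KUBE-SERVICES chain should appear in the nat table, and the metrics endpoint reports the active mode; a quick sanity check (a sketch):
~]# iptables -t nat -L KUBE-SERVICES -n | head
~]# curl -s http://127.0.0.1:10249/proxyMode    ## kube-proxy metrics port; expected to print "iptables"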
3.5.3. Verifying the Node information with kubectl on the Master
Once kubelet and kube-proxy are running normally on every Node, the Nodes register themselves with the Master automatically; they can then be listed on the Master with kubectl:
]# kubectl --kubeconfig=/opt/cfg/kubeconfig get nodes
NAME STATUS ROLES AGE VERSION
k8s31.vm.com NotReady <none> 11m v1.32.5
k8s32.vm.com NotReady <none> 11m v1.32.5
k8s33.vm.com NotReady <none> 6m33s v1.32.5
All Nodes are in the "NotReady" state because no CNI network plugin has been deployed yet, so the container network cannot be set up.
3.5.4. Installing the Calico CNI network plugin
Choose a CNI network plugin that suits your needs. With the Calico CNI plugin, deployment takes a single command:
- Download the calico YAML file (https://docs.projectcalico.org/manifests/calico.yaml)
https://github.com/projectcalico/calico/blob/release-v3.30/manifests/calico.yaml
- Configure the CIDR: find the CALICO_IPV4POOL_CIDR field, remove the comment in front of it, and set the IP range to match the --cluster-cidr parameter of kube-controller-manager:
            - name: CALICO_IPV4POOL_CIDR
              value: "168.26.0.0/16"
- Pull the images
app]# grep image calico.yaml
          image: docker.io/calico/cni:v3.30.0
          imagePullPolicy: IfNotPresent
          image: docker.io/calico/cni:v3.30.0
          imagePullPolicy: IfNotPresent
          image: docker.io/calico/node:v3.30.0
          imagePullPolicy: IfNotPresent
          image: docker.io/calico/node:v3.30.0
          imagePullPolicy: IfNotPresent
          image: docker.io/calico/kube-controllers:v3.30.0
          imagePullPolicy: IfNotPresent
app]# docker pull docker.io/calico/cni:v3.30.0
app]# docker pull docker.io/calico/node:v3.30.0
app]# docker pull docker.io/calico/kube-controllers:v3.30.0
- Create the Calico components
app]# kubectl apply -f /opt/app/calico.yaml --kubeconfig=/opt/cfg/kubeconfig
...
Once the CNI network plugin is running, the status of every Node changes to "Ready":
app]# watch kubectl get pod -A --kubeconfig=/opt/cfg/kubeconfig
...
app]# kubectl --kubeconfig=/opt/cfg/kubeconfig get pod -A -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-7bfdc5b57c-jt8vj 1/1 Running 0 20m 168.26.139.65 k8s33.vm.com <none> <none>
kube-system calico-node-k9twn 1/1 Running 0 20m 192.168.26.31 k8s31.vm.com <none> <none>
kube-system calico-node-kkx6k 1/1 Running 0 20m 192.168.26.33 k8s33.vm.com <none> <none>
kube-system calico-node-ls2kk 1/1 Running 0 20m 192.168.26.32 k8s32.vm.com <none> <none>
3.5.6 Configuring kubectl
- Create admin.kubeconfig with -server=https://192.168.26.100:9443. This only needs to be run once, on a single node. (This step is optional; the existing **/opt/cfg/kubeconfig** can be used instead.)
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/cert/ca.crt \
  --embed-certs=true \
  --server=https://192.168.26.100:9443 \
  --kubeconfig=/opt/cert/admin.kubeconfig
kubectl config set-credentials kubernetes-admin \
  --client-certificate=/opt/cert/client.crt \
  --client-key=/opt/cert/client.key \
  --embed-certs=true \
  --kubeconfig=/opt/cert/admin.kubeconfig
kubectl config set-context kubernetes-admin@kubernetes \
  --cluster=kubernetes \
  --user=kubernetes-admin \
  --kubeconfig=/opt/cert/admin.kubeconfig
kubectl config use-context kubernetes-admin@kubernetes --kubeconfig=/opt/cert/admin.kubeconfig
]# mkdir ~/.kube
### ]# cp /opt/cert/admin.kubeconfig ~/.kube/config
]# cp /opt/cfg/kubeconfig ~/.kube/config
]# scp -r ~/.kube root@k8s32:~/.
]# scp -r ~/.kube root@k8s33:~/.
- Configure kubectl command completion
~]# echo 'source <(kubectl completion bash)' >> ~/.bashrc
~]# yum -y install bash-completion
~]# source /usr/share/bash-completion/bash_completion
~]# source <(kubectl completion bash)
~]# kubectl get componentstatuses
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy ok
~]# kubectl get nodes -owide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s31.vm.com Ready <none> 159m v1.32.5 192.168.26.31 <none> CentOS Linux 8 4.18.0-348.7.1.el8_5.x86_64 docker://28.2.2
k8s32.vm.com Ready <none> 159m v1.32.5 192.168.26.32 <none> CentOS Linux 8 4.18.0-348.7.1.el8_5.x86_64 docker://28.2.2
k8s33.vm.com Ready <none> 154m v1.32.5 192.168.26.33 <none> CentOS Linux 8 4.18.0-348.7.1.el8_5.x86_64 docker://28.2.2
To let microservices inside the Kubernetes cluster reach each other by service name, a kube-dns service is also required; CoreDNS is the recommended way to provide DNS.
This completes the deployment of a highly available Kubernetes cluster with 3 Masters. You can now create Pods, Deployments, Services, and other resource objects to deploy and manage containerized applications and microservices.
4. Deploying the CoreDNS Service
Modify the DNS-related kubelet startup parameters on every Node, adding the following two parameters:
- --cluster-dns=168.26.0.100: the ClusterIP address of the DNS service.
- --cluster-domain=cluster.local: the domain name configured in the DNS service.
Then restart the kubelet service.
app]# cp coredns.yaml.base coredns.yaml
app]# grep image coredns.yaml
        image: registry.k8s.io/coredns/coredns:v1.11.3
        imagePullPolicy: IfNotPresent
app]# docker pull registry.k8s.io/coredns/coredns:v1.11.3 ## if it cannot be pulled, use the aliyun mirror
app]# docker pull registry.aliyuncs.com/google_containers/coredns:v1.11.3
app]# docker images|grep coredns
registry.aliyuncs.com/google_containers/coredns v1.11.3 c69fa2e9cbf5 10 months ago 61.8MB
registry.k8s.io/coredns/coredns v1.11.3 c69fa2e9cbf5 10 months ago 61.8MB
## Make the following substitutions:
## __DNS__DOMAIN__        -> cluster.local
## __DNS__MEMORY__LIMIT__ -> 170Mi
## __DNS__SERVER__        -> 168.26.0.100
## Locations of the changes
...
data:
  Corefile: |
    cluster.local {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local 168.26.0.0/16 {
            fallthrough in-addr.arpa ip6.arpa
        }
...
          resources:
            limits:
              memory: 170Mi
            requests:
              cpu: 100m
              memory: 70Mi
...
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 168.26.0.100
...
Create the CoreDNS service with kubectl create:
app]# kubectl create -f /opt/app/coredns.yaml
Check the Deployment, Pod, and Service to make sure the containers started successfully:
app]# kubectl get deployments --namespace=kube-system
NAME READY UP-TO-DATE AVAILABLE AGE
calico-kube-controllers 1/1 1 1 54m
coredns 1/1 1 1 12s
app]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7bfdc5b57c-jt8vj 1/1 Running 0 54m
calico-node-k9twn 1/1 Running 0 54m
calico-node-kkx6k 1/1 Running 0 54m
calico-node-ls2kk 1/1 Running 0 54m
coredns-f5bd749cf-wk7fc 1/1 Running 0 20s
app]# kubectl get services --namespace=kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 168.26.0.100 <none> 53/UDP,53/TCP,9153/TCP 30s
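A quick way to confirm that CoreDNS resolves cluster service names is to run nslookup from a throwaway Pod (a sketch; busybox:1.28 is used because its nslookup behaves reliably):
app]# kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default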
5. Deploying Metrics Server
Metrics Server is mainly used together with the Horizontal Pod Autoscaler and the Vertical Pod Autoscaler to provide autoscaling in Kubernetes. Its main characteristics are:
- Simple deployment: a single YAML file deploys it on most systems.
- Metrics are collected every 15s, enabling fast autoscaling of Pods.
- Lightweight: it uses very few resources, only 1m of CPU and 2MiB of memory per Node.
- Supports clusters of up to 5000 Nodes.
Installation is straightforward. First download the Metrics Server YAML configuration file from the official site:
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Alternatively, enter the URL directly in a browser to download it.
Adjust the configuration as needed; here the startup parameter --kubelet-insecure-tls is added so that TLS certificates are not verified when accessing the kubelet's HTTPS port.
...
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
        image: registry.k8s.io/metrics-server/metrics-server:v0.7.2
        imagePullPolicy: IfNotPresent
...
Then create Metrics Server from the YAML file and wait for the Pod to start successfully.
# kubectl create -f components.yaml
...
# kubectl --namespace=kube-system get pods -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-8467fcc7b7-2rxll 1/1 Running 0 41m
Next, kubectl top nodes and kubectl top pods can be used to monitor the CPU and memory usage of Nodes and Pods:
app]# kubectl top nodes
NAME CPU(cores) CPU(%) MEMORY(bytes) MEMORY(%)
k8s31.vm.com 97m 4% 1762Mi 48%
k8s32.vm.com 99m 4% 1730Mi 47%
k8s33.vm.com 108m 5% 1718Mi 47%
app]# kubectl top pod -A
NAMESPACE NAME CPU(cores) MEMORY(bytes)
default busybox 0m 0Mi
default nginx-web-2b8g8 0m 1Mi
default nginx-web-p8s6x 0m 1Mi
kube-system calico-kube-controllers-7bfdc5b57c-jt8vj 4m 19Mi
kube-system calico-node-k9twn 25m 212Mi
kube-system calico-node-kkx6k 26m 213Mi
kube-system calico-node-ls2kk 25m 215Mi
kube-system coredns-f5bd749cf-wk7fc 1m 72Mi
kube-system metrics-server-8467fcc7b7-2rxll 2m 21Mi