Kubernetes Pod Scheduling
- I. Scheduling Methods and Workflow
- II. Directed Scheduling
- nodeName
- nodeSelector
- III. Affinity Scheduling
- nodeAffinity
- podAffinity
- podAntiAffinity
- IV. Taints and Tolerations
- Taints
- Toleration
- Create a DaemonSet controller that runs an nginx Pod, including on the master node
Official documentation: https://kubernetes.io/zh-cn/docs/concepts/scheduling-eviction/kube-scheduler/
I. Scheduling Methods and Workflow
Kubernetes offers four ways to decide where a Pod runs:
- Automatic scheduling: the target node is computed entirely by the Scheduler through a series of algorithms
- Directed scheduling: NodeName, NodeSelector
- Affinity scheduling: NodeAffinity, PodAffinity, PodAntiAffinity
- Taint (toleration) scheduling: Taints, Toleration
Core scheduling workflow
1. Filtering
Eliminate every node in the cluster that cannot run the Pod, producing a list of candidate nodes:
- Are node resources sufficient (CPU, memory, etc. satisfy resources.requests)?
- Do the node's taints match the Pod's tolerations?
- Do node selector or node affinity rules match?
- Are there port conflicts, does the hostname match, etc.?
2. Scoring
Rank the candidate nodes by priority and pick the highest-scoring node. Default scoring strategies include:
- Resource utilization
- Node affinity weight
- Pod affinity / anti-affinity
- Topology spread (e.g. balancing across availability zones)
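For example, the filtering step compares a Pod's resources.requests against each node's allocatable capacity. A minimal sketch (the pod name request-demo is hypothetical, chosen only for illustration):
apiVersion: v1
kind: Pod
metadata:
  name: request-demo   # hypothetical name
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        cpu: "500m"      # nodes with less than 0.5 CPU unreserved are filtered out
        memory: "256Mi"  # nodes with less than 256 MiB unreserved are filtered out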
II. Directed Scheduling
Directed scheduling is mandatory: with nodeName, the scheduler is bypassed entirely, so even if the target Node does not exist the Pod is still bound to it and simply fails to run. (With nodeSelector, an unmatched Pod instead stays Pending, as shown later.)
Label a node: kubectl label nodes <node-name> <key>=<value>
View labels: kubectl get node --show-labels
nodeName
1. Create a namespace
[root@master sch]# kubectl create namespace sc
2. Write the Pod's YAML file
[root@master sch]# cat pod.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: sc
---
apiVersion: v1
kind: Pod
metadata:
  name: huang
  namespace: sc
  labels:
    stu: good
spec:
  nodeName: k8s-3   # schedule directly onto node k8s-3
  containers:
  - name: huang-nginx
    image: nginx
    ports:
    - containerPort: 80
      name: http-port
3. Apply
[root@k8s-1 pod]# kubectl apply -f pod.yaml
pod/huang created
[root@k8s-1 pod]# kubectl get pod -n sc
NAME READY STATUS RESTARTS AGE
huang 1/1 Running 0 13s
[root@k8s-1 pod]# kubectl get pod -n sc -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
huang 1/1 Running 0 18s 10.224.13.82 k8s-3 <none> <none>
nodeSelector
[root@k8s-1 pod]# kubectl label nodes k8s-2 nodeenv=pro
node/k8s-2 labeled
[root@k8s-1 pod]# kubectl label nodes k8s-3 nodeenv=test
node/k8s-3 labeled
[root@k8s-1 pod]# kubectl get node --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k8s-1 Ready control-plane,master 9d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-1,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
k8s-2 Ready worker 8d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-2,kubernetes.io/os=linux,node-role.kubernetes.io/worker=worker,nodeenv=pro
k8s-3 Ready worker 8d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-3,kubernetes.io/os=linux,node-role.kubernetes.io/worker=worker,nodeenv=test
Create a Pod with a node selector
[root@k8s-1 pod]# cat pod-nodeselector.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-nodeselector
  namespace: sc
spec:
  nodeSelector:
    nodeenv: pro
  containers:
  - name: s-nginx
    image: nginx
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 80
      name: http-port
Verify the scheduling
[root@k8s-1 pod]# kubectl apply -f pod-nodeselector.yaml
pod/pod-nodeselector created
[root@k8s-1 pod]# kubectl get pods -n sc pod-nodeselector -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-nodeselector 1/1 Running 0 45s 10.224.200.210 k8s-2 <none> <none>
[root@k8s-1 pod]# kubectl describe pod -n sc pod-nodeselector
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  71s   default-scheduler  Successfully assigned sc/pod-nodeselector to k8s-2
  Normal  Pulled     66s   kubelet            Container image "nginx" already present on machine
  Normal  Created    66s   kubelet            Created container s-nginx
  Normal  Started    66s   kubelet            Started container s-nginx
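Unlike nodeName, an unmatched nodeSelector does not force the binding: the Pod stays Pending with a FailedScheduling event until a node with the label appears. A hypothetical sketch (the pod name and the label value dev match nothing in this cluster):
apiVersion: v1
kind: Pod
metadata:
  name: pod-nodeselector-pending   # hypothetical name
spec:
  nodeSelector:
    nodeenv: dev   # no node carries nodeenv=dev, so the Pod stays Pending
  containers:
  - name: nginx
    image: nginx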
III. Affinity Scheduling
nodeAffinity
Targets nodes: it decides which nodes a Pod may be scheduled onto.
If two applications interact frequently, it is worth using affinity to place them as close together as possible, which reduces the performance cost of cross-node network communication.
1. Label the node
[root@k8s-1 pod]# kubectl label nodes k8s-3 disktype=ssd
node/k8s-3 labeled
## To delete a label, append - after the label key
[root@k8s-1 pod]# kubectl label nodes k8s-3 disktype-
node/k8s-3 unlabeled
[root@k8s-1 pod]# kubectl label nodes k8s-3 disktype=ssd
node/k8s-3 labeled
[root@k8s-1 pod]# kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k8s-1 Ready control-plane,master 9d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-1,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
k8s-2 Ready worker 9d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-2,kubernetes.io/os=linux,node-role.kubernetes.io/worker=worker,nodeenv=pro
k8s-3 Ready worker 9d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-3,kubernetes.io/os=linux,node-role.kubernetes.io/worker=worker,nodeenv=test
2. Start a Pod that selects the labeled node
[root@k8s-1 pod]# cat pod-nodeaffinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: l-nginx
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
[root@k8s-1 pod]# kubectl apply -f pod-nodeaffinity.yaml
pod/l-nginx created
[root@k8s-1 pod]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
l-nginx 1/1 Running 0 113s 10.224.13.106 k8s-3 <none> <none>
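requiredDuringSchedulingIgnoredDuringExecution is a hard rule: if no node matches, the Pod stays Pending. nodeAffinity also offers a soft variant, preferredDuringSchedulingIgnoredDuringExecution, which only adds weight during scoring and can be ignored. A minimal sketch (the pod name prefer-nginx is hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: prefer-nginx   # hypothetical name
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50              # 1-100; higher adds more to a matching node's score
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent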
podAffinity
Targets Pods: it decides which existing Pods a new Pod may be placed alongside in the same topology domain.
Create the first Pod
[root@k8s-1 pod]# cat lyb.yaml
apiVersion: v1
kind: Pod
metadata:
  name: lyb-nginx
  labels:
    workcity: guangzhou
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
[root@k8s-1 pod]# kubectl apply -f lyb.yaml
pod/lyb-nginx created
[root@k8s-1 pod]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
l-nginx 1/1 Running 0 27m 10.224.13.106 k8s-3 <none> <none>
lyb-nginx 1/1 Running 0 35s 10.224.200.201 k8s-2 <none> <none>
Create a second Pod, xa-nginx, that uses Pod affinity to be scheduled onto the node where lyb-nginx runs
[root@k8s-1 pod]# cat xa.yaml
apiVersion: v1
kind: Pod
metadata:
  name: xa-nginx
  labels:
    workcity: guangzhou
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: workcity
            operator: In
            values: ["shenzhen","changsha","guangzhou"]
        topologyKey: kubernetes.io/hostname
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
[root@k8s-1 pod]# kubectl apply -f xa.yaml
pod/xa-nginx created
[root@k8s-1 pod]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
l-nginx 1/1 Running 0 27m 10.224.13.106 k8s-3 <none> <none>
lyb-nginx 1/1 Running 0 5m16s 10.224.200.201 k8s-2 <none> <none>
xa-nginx 1/1 Running 0 8s 10.224.200.200 k8s-2 <none> <none>
The lyb-nginx Pod carries the label workcity: guangzhou, which satisfies the labelSelector rule, so xa-nginx was scheduled onto k8s-2, the node where lyb-nginx runs.
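topologyKey defines what "the same topology domain" means: kubernetes.io/hostname makes the domain a single node. A zone label widens the domain so the Pods only need to share an availability zone. A hedged sketch, assuming the nodes carry the standard topology.kubernetes.io/zone label (the label listings above do not show it, so this would need zone labels to be added first; the pod name zone-nginx is hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: zone-nginx   # hypothetical name
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: workcity
            operator: In
            values: ["guangzhou"]
        topologyKey: topology.kubernetes.io/zone   # same zone suffices; not necessarily the same node
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent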
podAntiAffinity
Targets Pods: it decides which existing Pods a new Pod must not share a topology domain with.
When an application is deployed with multiple replicas, anti-affinity is useful for spreading the instances across nodes, which improves the service's availability (see the Deployment sketch after the verification output below).
[root@k8s-1 pod]# cat c.yaml
apiVersion: v1
kind: Pod
metadata:
  name: c-nginx
  labels:
    workcity: shanghai
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: workcity
            operator: In
            values: ["shanghai","shenzhen","guangzhou"]
        topologyKey: kubernetes.io/hostname
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
[root@k8s-1 pod]# kubectl apply -f c.yaml
pod/c-nginx created
# Verify
[root@k8s-1 pod]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
l-nginx 1/1 Running 0 27m 10.224.13.106 k8s-3 <none> <none>
c-nginx 1/1 Running 0 6s 10.224.13.84 k8s-3 <none> <none>
lyb-nginx 1/1 Running 0 15m 10.224.200.201 k8s-2 <none> <none>
xa-nginx 1/1 Running 0 10m 10.224.200.200 k8s-2 <none> <none>
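To apply the high-availability pattern mentioned above, each replica of a Deployment can repel Pods carrying its own label. A minimal sketch (the name ha-nginx and label app: ha-nginx are hypothetical); because the rule is requiredDuringSchedulingIgnoredDuringExecution, no two replicas can share a node, and replicas beyond the number of eligible nodes stay Pending:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ha-nginx   # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ha-nginx
  template:
    metadata:
      labels:
        app: ha-nginx
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: ha-nginx   # repel Pods carrying this Deployment's own label
            topologyKey: kubernetes.io/hostname
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent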
IV. Taints and Tolerations
A taint has the form key=value:effect, where key and value are the taint's label and effect describes what the taint does. Three effects are supported:
- PreferNoSchedule: Kubernetes tries to avoid scheduling Pods onto a Node with this taint, unless no other node is schedulable
- NoSchedule: Kubernetes will not schedule Pods onto a Node with this taint, but Pods already running on the Node are unaffected
- NoExecute: Kubernetes will not schedule Pods onto a Node with this taint, and will also evict Pods already running on it
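The examples below demonstrate NoSchedule and NoExecute; a PreferNoSchedule taint is set the same way (the key/value city=shanghai here is hypothetical):
kubectl taint nodes k8s-2 city=shanghai:PreferNoSchedule    # soft taint: avoided, not forbidden
kubectl taint nodes k8s-2 city=shanghai:PreferNoSchedule-   # remove it again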
Taints
A taint can be thought of as a label on a node: when scheduling Pods, the scheduler avoids placing them on tainted nodes, so a node can repel a class of Pods.
View taints
[root@k8s-1 pod]# kubectl describe node k8s-1|grep Taint
Taints: node-role.kubernetes.io/master:NoSchedule
Define a taint
[root@k8s-1 pod]# kubectl taint nodes k8s-3 city=beijing:NoSchedule
node/k8s-3 tainted
[root@k8s-1 pod]# kubectl describe node k8s-3|grep Taint
Taints: city=beijing:NoSchedule
[root@k8s-1 pod]# cat pod-4.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-taint-test
spec:
  containers:
  - name: nginx
    image: nginx:latest
    imagePullPolicy: IfNotPresent
[root@k8s-1 pod]# kubectl apply -f pod-4.yaml
pod/pod-taint-test created
[root@k8s-1 pod]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
l-nginx 1/1 Running 0 27m 10.224.13.106 k8s-3 <none> <none>
c-nginx 1/1 Running 0 13m 10.224.13.107 k8s-3 <none> <none>
lyb-nginx 1/1 Running 0 29m 10.224.200.201 k8s-2 <none> <none>
pod-taint-test 1/1 Running 0 11s 10.224.200.199 k8s-2 <none> <none>
xa-nginx 1/1 Running 0 24m 10.224.200.200 k8s-2 <none> <none>
Because k8s-3 now carries the NoSchedule taint, pod-taint-test lands on k8s-2.
Delete the taint
[root@k8s-1 pod]# kubectl taint nodes k8s-3 city=beijing:NoSchedule-
node/k8s-3 untainted
Using NoExecute (this effect also evicts Pods already running on the node)
[root@k8s-1 pod]# kubectl taint nodes k8s-3 city=fudao:NoExecute
node/k8s-3 tainted
[root@k8s-1 pod]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
l-nginx 1/1 Terminating 0 63m 10.224.13.106 k8s-3 <none> <none>
lyb-nginx 1/1 Running 0 36m 10.224.200.201 k8s-2 <none> <none>
pod-taint-test 1/1 Running 0 7m14s 10.224.200.199 k8s-2 <none> <none>
xa-nginx 1/1 Running 0 31m 10.224.200.200 k8s-2 <none> <none>
[root@k8s-1 pod]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
lyb-nginx 1/1 Running 0 36m 10.224.200.201 k8s-2 <none> <none>
pod-taint-test 1/1 Running 0 7m34s 10.224.200.199 k8s-2 <none> <none>
xa-nginx 1/1 Running 0 31m 10.224.200.200 k8s-2 <none> <none>
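Note that l-nginx and c-nginx, which were running on k8s-3, are evicted by the NoExecute taint: the first listing catches l-nginx in Terminating, and the second listing no longer shows either Pod.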
[root@k8s-1 pod]# cat pod-4.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-taint-test-2
spec:
  containers:
  - name: nginx
    image: nginx:latest
    imagePullPolicy: IfNotPresent
[root@k8s-1 pod]# kubectl apply -f pod-4.yaml
pod/pod-taint-test-2 created
[root@k8s-1 pod]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
lyb-nginx 1/1 Running 0 39m 10.224.200.201 k8s-2 <none> <none>
pod-taint-test 1/1 Running 0 10m 10.224.200.199 k8s-2 <none> <none>
pod-taint-test-2 1/1 Running 0 3s 10.224.200.197 k8s-2 <none> <none>
xa-nginx 1/1 Running 0 34m 10.224.200.200 k8s-2 <none> <none>
Toleration
A toleration allows the scheduler to place a Pod onto a node that carries the matching taint. (For the result below to land on k8s-3, the city=fudao:NoExecute taint must have been removed and city=beijing:NoSchedule re-applied between steps; the transcript does not show those commands.)
[root@k8s-1 pod]# cat pod-5.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-taint-test-tolerations-2
spec:
  containers:
  - name: nginx
    image: nginx:latest
    imagePullPolicy: IfNotPresent
  tolerations:
  - key: "city"
    operator: "Equal"
    value: "beijing"
    effect: "NoSchedule"
[root@k8s-1 pod]# kubectl apply -f pod-5.yaml
pod/pod-taint-test-tolerations-2 created
[root@k8s-1 pod]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
lyb-nginx 1/1 Running 0 44m 10.224.200.201 k8s-2 <none> <none>
pod-taint-test 1/1 Running 0 15m 10.224.200.199 k8s-2 <none> <none>
pod-taint-test-2 1/1 Running 0 4m40s 10.224.200.197 k8s-2 <none> <none>
pod-taint-test-tolerations-2 1/1 Running 0 8s 10.224.13.109 k8s-3 <none> <none>
xa-nginx 1/1 Running 0 39m 10.224.200.200 k8s-2 <none> <none>
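Besides operator: "Equal", a toleration can use operator: "Exists" to match any value of a key (the value field must then be omitted). A minimal sketch (the pod name is hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: pod-toleration-exists   # hypothetical name
spec:
  containers:
  - name: nginx
    image: nginx:latest
    imagePullPolicy: IfNotPresent
  tolerations:
  - key: "city"
    operator: "Exists"   # tolerates city=<any value> with effect NoSchedule
    effect: "NoSchedule"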
Disable scheduling on a node
The node's status changes from Ready to Ready,SchedulingDisabled (ready, but no new Pods will be scheduled onto it)
[root@k8s-1 pod]# kubectl cordon k8s-3
node/k8s-3 cordoned
[root@k8s-1 pod]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-1 Ready control-plane,master 9d v1.23.17
k8s-2 Ready worker 9d v1.23.17
k8s-3 Ready,SchedulingDisabled worker 9d v1.23.17
# Restore
[root@k8s-1 pod]# kubectl uncordon k8s-3
node/k8s-3 uncordoned
Evict all Pods from a node
kubectl drain k8s-3 --ignore-daemonsets
--ignore-daemonsets skips Pods managed by a DaemonSet (such Pods are meant to run on every node and cannot be evicted)
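drain cordons the node before evicting, and a commonly used companion flag is needed when Pods mount emptyDir volumes. A sketch of the usual sequence:
kubectl drain k8s-3 --ignore-daemonsets --delete-emptydir-data   # evict Pods, deleting emptyDir data if present
kubectl uncordon k8s-3                                           # drain leaves the node cordoned; re-enable scheduling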
Create a DaemonSet controller that runs an nginx Pod, including on the master node
1. Check which taints the master node has, then configure matching tolerations for the nginx Pod
[root@k8s-1 pod]# kubectl describe node k8s-1|grep Taint
Taints: node-role.kubernetes.io/master:NoSchedule
2. Start the nginx Pod with a DaemonSet controller
[root@k8s-1 pod]# cat daemonset-nginx.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: daemonset-nginx
spec:
  selector:
    matchLabels:
      app: dm-nginx
  template:
    metadata:
      labels:
        app: dm-nginx
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master   # match the master node's taint key
        operator: Exists                      # tolerate the taint whenever it exists
        effect: NoSchedule                    # match the taint's effect
      - key: city
        operator: Exists
        effect: NoSchedule
      containers:
      - name: dm-nginx
        image: nginx:latest
        imagePullPolicy: IfNotPresent
[root@k8s-1 pod]# kubectl apply -f daemonset-nginx.yaml
daemonset.apps/daemonset-nginx configured
[root@k8s-1 pod]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
daemonset-nginx-dsmrn 1/1 Running 0 6s 10.224.231.215 k8s-1 <none> <none>
daemonset-nginx-nft4q 1/1 Running 0 10s 10.224.200.213 k8s-2 <none> <none>
daemonset-nginx-z4ljp 1/1 Running 0 14s 10.224.13.110 k8s-3 <none> <none>
lyb-nginx 1/1 Running 0 105m 10.224.200.201 k8s-2 <none> <none>
pod-taint-test 1/1 Running 0 76m 10.224.200.199 k8s-2 <none> <none>
pod-taint-test-2 1/1 Running 0 65m 10.224.200.197 k8s-2 <none> <none>
pod-taint-test-tolerations-2 1/1 Running 0 61m 10.224.13.109 k8s-3 <none> <none>
xa-nginx 1/1 Running 0 100m 10.224.200.200 k8s-2 <none> <none>
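daemonset-nginx-dsmrn running on k8s-1 confirms the master toleration works: the DaemonSet places one replica on every node, including the control plane.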