
Deploying Kubernetes 1.33.3 Cleanly with Ansible (RedHat 10)

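Contents

    • 1. Goals
    • 2. Quick Start (One Command)
    • 3. Approach and Components
    • 4. Directory Layout
    • 5. Sample Inventory
    • 6. Global Variables (Overridable)
    • 7. Main Playbook
    • 8. Role Implementations
      • 8.1 preflight (environment checks)
      • 8.2 bootstrap (system initialization)
      • 8.3 containerd (install and configure)
      • 8.4 kubernetes (install kubelet/kubeadm/kubectl)
      • 8.5 controlplane-init (initialize the first control plane)
      • 8.6 node-join (join nodes)
      • 8.7 cni-calico (install networking)
      • 8.8 addons (common add-ons)
    • 9. Operations and Scaling
    • 10. Offline / Internal Mirrors
    • 11. Cleanup / Teardown (Use with Caution)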

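1. Goals

Goal: bring up a production-ready K8s cluster (kubeadm + containerd + Calico) with a single command. Scaling out only takes adding the node to the inventory and rerunning the playbook; scaling in additionally requires draining and deleting the node by hand, because the playbook never removes nodes. The whole flow is designed to be idempotent, usable offline, and able to run multi-master HA.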

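2. Quick Start (One Command)

Prepare an Ansible control machine that can SSH to every node.

Edit inventory.ini and fill in the node IPs/hostnames.

Run:

ansible-playbook -i inventory.ini site.yml -b \
  -e "k8s_version=1.33.3 api_endpoint=192.168.31.10 pod_cidr=10.244.0.0/16 svc_cidr=10.96.0.0/12"

Single master: set api_endpoint to the first master's IP. Multi-master: set it to a VIP or DNS name (the LB/VIP must be prepared in advance).

Scaling out: add the node to inventory.ini and rerun the same command (the join scripts are generated automatically).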
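Before the real run it can help to confirm that every node answers over SSH and that the playbook parses. The wrapper below is a minimal sketch, not part of the original project: the function names and the environment variables (API_ENDPOINT, K8S_VERSION, POD_CIDR, SVC_CIDR, DRY_RUN) are assumptions introduced for illustration, while the ansible and ansible-playbook flags are the standard ones.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the command above; defaults mirror group_vars/all.yml.

# Build the key=value string handed to `ansible-playbook -e`.
# API_ENDPOINT has no default: it must be the first master's IP or the VIP.
build_extra_vars() {
  if [ -z "${API_ENDPOINT:-}" ]; then
    echo "API_ENDPOINT is required" >&2
    return 1
  fi
  printf 'k8s_version=%s api_endpoint=%s pod_cidr=%s svc_cidr=%s\n' \
    "${K8S_VERSION:-1.33.3}" "$API_ENDPOINT" \
    "${POD_CIDR:-10.244.0.0/16}" "${SVC_CIDR:-10.96.0.0/12}"
}

# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi
}

# Check SSH reachability and playbook syntax, then deploy.
deploy() {
  local inv="${1:-inventory.ini}" extra
  extra="$(build_extra_vars)" || return 1
  run ansible -i "$inv" k8s -m ping &&
    run ansible-playbook -i "$inv" site.yml --syntax-check &&
    run ansible-playbook -i "$inv" site.yml -b -e "$extra"
}
```

Running `API_ENDPOINT=192.168.31.101 DRY_RUN=1 deploy inventory.ini` only prints the three commands, which is a cheap way to review them before dropping DRY_RUN.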

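3. Approach and Components

  • OS: RedHat 10 (systemd, dnf, and SELinux available)

  • CRI: containerd (systemd cgroup driver)

  • Kubernetes: kubeadm/kubelet/kubectl v1.33.3

  • CNI: Calico (default Pod CIDR 10.244.0.0/16)

  • Service CIDR: 10.96.0.0/12

  • Time service: chrony

  • Optional: multi-master HA (external HAProxy/Keepalived)

  • Features: idempotent, switchable between offline sources and mirror registries, automatic join-command generation, automatic sysctl setup, configurable swap/firewall policy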

4. Directory Layout

ansible-k8s/
├─ inventory.ini
├─ group_vars/
│  └─ all.yml
├─ site.yml
├─ roles/
│  ├─ preflight/
│  │  └─ tasks/main.yml
│  ├─ bootstrap/
│  │  └─ tasks/main.yml
│  ├─ containerd/
│  │  ├─ tasks/main.yml
│  │  └─ templates/config.toml.j2
│  ├─ kubernetes/
│  │  ├─ tasks/main.yml
│  │  └─ templates/kubernetes.repo.j2
│  ├─ controlplane-init/
│  │  └─ tasks/main.yml
│  ├─ node-join/
│  │  └─ tasks/main.yml
│  ├─ cni-calico/
│  │  └─ tasks/main.yml
│  └─ addons/
│     └─ tasks/main.yml
└─ artifacts/          # auto-generated join scripts / temporary files

5. Sample Inventory

# inventory.ini
[all:vars]
ansible_user=root
ansible_ssh_common_args='-o StrictHostKeyChecking=no'

[masters]
master01 ansible_host=192.168.31.101
# master02 ansible_host=192.168.31.102
# master03 ansible_host=192.168.31.103

[workers]
worker01 ansible_host=192.168.31.201
worker02 ansible_host=192.168.31.202

[k8s:children]
masters
workers
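Because site.yml treats masters[0] as the node that runs kubeadm init, it is worth checking which hosts each group actually contains before a run. `ansible-inventory -i inventory.ini --graph` shows the full tree; the awk helper below is a small sketch for scripts that only need the host names of one group. The function name is hypothetical, and it only handles the simple INI layout shown above.

```shell
#!/usr/bin/env bash
# Print the host names listed under one group of an INI inventory.
# Usage: inventory_hosts inventory.ini masters
inventory_hosts() {
  awk -v group="[$2]" '
    /^\[/ { in_group = ($1 == group); next }          # a new section header starts
    in_group && NF && $1 !~ /^[#;]/ { print $1 }      # skip blanks and comments
  ' "$1"
}
```

The first line printed for the masters group is the host that will become the initial control plane.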

6. Global Variables (Overridable)

# group_vars/all.yml
k8s_version: "1.33.3"
cri: "containerd"
pod_cidr: "10.244.0.0/16"
svc_cidr: "10.96.0.0/12"
api_endpoint: "192.168.31.101" # single master = master01 IP, multi-master = VIP/DNS name
cluster_name: "prod-cluster"
timezone: "Asia/Shanghai"

# Images / registries (can point at an internal mirror)
image_repo: "registry.k8s.io"
calico_version: "v3.27.2" # check Calico's compatibility matrix against the Kubernetes version in use

# System policy
swap_disable: true
selinux_state: permissive # enforcing/permissive/disabled
disable_firewalld: true
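The Pod CIDR, the Service CIDR, and the node network must not overlap, which is presumably why the Pod CIDR here is 10.244.0.0/16 rather than Calico's stock 192.168.0.0/16 (that range would contain the 192.168.31.x node addresses). The sketch below checks the three ranges with integer arithmetic. The function names are hypothetical, only IPv4 is handled, and the /24 node subnet in the usage line is an assumption, since the article only lists individual node IPs.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local IFS=.
  # shellcheck disable=SC2086  # splitting on dots is intended
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print "first last" (as integers) for a CIDR such as 10.244.0.0/16.
cidr_range() {
  local ip="${1%/*}" bits="${1#*/}" base size
  base=$(ip_to_int "$ip")
  size=$(( 1 << (32 - bits) ))
  base=$(( base & ~(size - 1) ))   # clear the host bits
  echo "$base $(( base + size - 1 ))"
}

# Return 0 when the two CIDRs share at least one address.
cidr_overlap() {
  local r1 r2
  r1=$(cidr_range "$1")
  r2=$(cidr_range "$2")
  [ "${r1% *}" -le "${r2#* }" ] && [ "${r2% *}" -le "${r1#* }" ]
}

# Return non-zero when any pair of pod, service, and node networks overlaps.
check_cidrs() {
  local rc=0
  if cidr_overlap "$1" "$2"; then echo "pod and service CIDRs overlap" >&2; rc=1; fi
  if cidr_overlap "$1" "$3"; then echo "pod CIDR overlaps the node network" >&2; rc=1; fi
  if cidr_overlap "$2" "$3"; then echo "service CIDR overlaps the node network" >&2; rc=1; fi
  return "$rc"
}
```

For the values in this article, `check_cidrs 10.244.0.0/16 10.96.0.0/12 192.168.31.0/24` succeeds, while swapping the Pod CIDR for 192.168.0.0/16 reports a conflict with the node network.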

7. Main Playbook

# site.yml
- hosts: k8s
  gather_facts: yes
  become: yes
  vars_files:
    - group_vars/all.yml
  pre_tasks:
    - name: Only RedHat family supported
      fail: { msg: "Only RedHat-like systems supported" }
      when: ansible_os_family != 'RedHat'
  roles:
    - preflight
    - bootstrap
    - containerd
    - kubernetes

- hosts: masters[0]
  become: yes
  roles:
    - controlplane-init

- hosts: masters[1:]
  become: yes
  roles:
    - role: node-join
      vars: { join_type: controlplane }

- hosts: workers
  become: yes
  roles:
    - role: node-join
      vars: { join_type: worker }

- hosts: masters[0]
  become: yes
  roles:
    - cni-calico
    - addons

8. Role Implementations

8.1 preflight (environment checks)

# roles/preflight/tasks/main.yml
- name: Check CPU/Memory
  assert:
    that:
      # A "2G" machine reports slightly under 2048 MB; 1700 MB matches kubeadm's own preflight floor
      - ansible_memtotal_mb | int >= 1700
      - ansible_processor_vcpus | int >= 2
    fail_msg: "Need >=2 vCPU and >=2G RAM"

# The two checks below are advisory: failed_when: false means they never stop the play
- name: Check kernel modules available
  shell: |
    modprobe br_netfilter && modprobe overlay
  changed_when: false
  failed_when: false

- name: Ensure outbound connectivity
  shell: curl -sSf https://kubernetes.io >/dev/null
  changed_when: false
  failed_when: false

8.2 bootstrap (system initialization)

# roles/bootstrap/tasks/main.yml
- name: Set timezone
  command: timedatectl set-timezone {{ timezone }}
  changed_when: false

- name: Install base packages
  package:
    name:
      - chrony
      - iproute
      - iptables
      - socat
      - conntrack-tools
      - ethtool
      - ebtables
      - curl
      - jq
      - bash-completion
    state: present

- name: Enable & start chronyd
  service: { name: chronyd, state: started, enabled: yes }

- name: Disable swap runtime
  command: swapoff -a
  changed_when: false
  when: swap_disable | bool

- name: Remove swap from fstab
  replace:
    path: /etc/fstab
    regexp: '^([^#\n].*\sswap\s)'
    replace: '# \1'
  when: swap_disable | bool

# The Jinja expression must be quoted inside a flow mapping, otherwise the YAML does not parse
- name: Set SELinux state
  selinux: { policy: targeted, state: "{{ selinux_state }}" }

- name: Stop/disable firewalld when requested
  service: { name: firewalld, state: stopped, enabled: no }
  when: disable_firewalld | bool

- name: Write modules-load
  copy:
    dest: /etc/modules-load.d/k8s.conf
    content: |
      br_netfilter
      overlay

- name: Load modules
  modprobe: { name: "{{ item }}", state: present }
  loop: [ 'br_netfilter', 'overlay' ]

- name: Sysctl for Kubernetes
  copy:
    dest: /etc/sysctl.d/99-kubernetes-cri.conf
    content: |
      net.bridge.bridge-nf-call-iptables = 1
      net.bridge.bridge-nf-call-ip6tables = 1
      net.ipv4.ip_forward = 1

- name: Apply sysctl
  command: sysctl --system
  changed_when: false
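After the bootstrap role has run, two things are worth verifying on a node: that no swap device is still active (the kubelet refuses to start with swap on by default) and that the sysctl file really carries the three keys. The helpers below are a sketch under those assumptions. Their names are hypothetical, and they read files rather than live kernel state so they can also be pointed at copies.

```shell
#!/usr/bin/env bash
# Return 0 when the swaps table (default /proc/swaps) lists an active device.
# The first line of that file is a header, so any further non-empty line is a device.
swap_active() {
  awk 'NR > 1 && NF { found = 1 } END { exit !found }' "${1:-/proc/swaps}"
}

# Print every required key that is not set to 1 in the given sysctl file.
missing_sysctls() {
  local key
  for key in net.bridge.bridge-nf-call-iptables \
             net.bridge.bridge-nf-call-ip6tables \
             net.ipv4.ip_forward; do
    grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*1[[:space:]]*$" "$1" ||
      echo "$key"
  done
}
```

On a node, `swap_active && echo "swap is still on"` and `missing_sysctls /etc/sysctl.d/99-kubernetes-cri.conf` should both print nothing.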

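8.3 containerd (install and configure)

# roles/containerd/tasks/main.yml
# Assumes a configured repository provides a package named "containerd"
- name: Install containerd
  package: { name: containerd, state: present }

# Registered for inspection only; the template below does not consume it
- name: Generate default config
  command: containerd config default
  register: cdef
  changed_when: false

- name: Write config.toml
  template: { src: config.toml.j2, dest: /etc/containerd/config.toml }
  register: containerd_config

- name: Enable & start containerd
  service: { name: containerd, state: started, enabled: yes }

- name: Restart containerd when the config changed
  service: { name: containerd, state: restarted }
  when: containerd_config is changed

# roles/containerd/templates/config.toml.j2
# Version 2 schema (containerd 1.x); containerd 2.x uses a version 3 schema with different plugin IDs
version = 2

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "{{ image_repo }}/pause:3.10"

    [plugins."io.containerd.grpc.v1.cri".cni]
      bin_dir = "/opt/cni/bin"
      conf_dir = "/etc/cni/net.d"

    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
      runtime_type = "io.containerd.runc.v2"

      # The systemd cgroup driver is a runc option, not a top-level CRI setting
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
        SystemdCgroup = true

  [plugins."io.containerd.runtime.v2.task"]
    platforms = ["linux/amd64"]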
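With the runc v2 shim, the systemd cgroup driver is switched on through the runc option SystemdCgroup = true. A mismatch between containerd and the kubelet on this setting is a common cause of crash-looping pods, so a quick check after the role runs may be worthwhile. The function below is a minimal sketch with a hypothetical name; it only inspects the file text and does not parse TOML sections.

```shell
#!/usr/bin/env bash
# Return 0 if the given containerd config contains "SystemdCgroup = true".
has_systemd_cgroup() {
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$1"
}
```

Typical use on a node is `has_systemd_cgroup /etc/containerd/config.toml || echo "systemd cgroup driver not enabled"`; `containerd config dump` prints the configuration that the daemon resolves, if the file alone leaves doubt.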

8.4 kubernetes (install kubelet/kubeadm/kubectl)

# roles/kubernetes/tasks/main.yml
- name: Add Kubernetes repo
  template: { src: kubernetes.repo.j2, dest: /etc/yum.repos.d/kubernetes.repo }

- name: Install kube components
  package:
    name:
      - kubelet-{{ k8s_version }}
      - kubeadm-{{ k8s_version }}
      - kubectl-{{ k8s_version }}
    state: present

# Only enable the unit: forcing state=stopped would stop the kubelet of a running node on every rerun
- name: Enable kubelet
  service: { name: kubelet, enabled: yes }

# roles/kubernetes/templates/kubernetes.repo.j2
# pkgs.k8s.io repositories are per minor release (v1.33), not per patch release (v1.33.3)
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v{{ k8s_version.split('.')[:2] | join('.') }}/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v{{ k8s_version.split('.')[:2] | join('.') }}/rpm/repodata/repomd.xml.key
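The community repositories on pkgs.k8s.io are organized per minor release, so a full version such as 1.33.3 has to be reduced to 1.33 before it goes into the repository URL, while the package name keeps the full version. The sketch below derives the URL by hand so it can be checked independently of the template; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Reduce a full version (1.33.3) to its minor release (1.33).
k8s_minor() {
  echo "$1" | cut -d. -f1,2
}

# Print the RPM repository base URL for a given Kubernetes version.
k8s_repo_baseurl() {
  echo "https://pkgs.k8s.io/core:/stable:/v$(k8s_minor "$1")/rpm/"
}
```

From a machine with internet access, `curl -fsSI "$(k8s_repo_baseurl 1.33.3)repodata/repomd.xml"` is a quick reachability check of the resulting URL.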

8.5 controlplane-init (initialize the first control plane)

# roles/controlplane-init/tasks/main.yml
- name: Create kubeadm dir
  file: { path: /etc/kubernetes/kubeadm, state: directory, mode: '0755' }

- name: Render kubeadm config
  copy:
    dest: /etc/kubernetes/kubeadm/config.yaml
    content: |
      apiVersion: kubeadm.k8s.io/v1beta4
      kind: ClusterConfiguration
      kubernetesVersion: v{{ k8s_version }}
      clusterName: {{ cluster_name }}
      networking:
        podSubnet: {{ pod_cidr }}
        serviceSubnet: {{ svc_cidr }}
      controlPlaneEndpoint: {{ api_endpoint }}:6443
      apiServer:
        # v1beta4 takes extraArgs as a list of name/value pairs, not a map
        extraArgs:
          - name: authorization-mode
            value: Node,RBAC
      imageRepository: {{ image_repo }}
      certificatesDir: /etc/kubernetes/pki
      ---
      apiVersion: kubeadm.k8s.io/v1beta4
      kind: InitConfiguration
      nodeRegistration:
        criSocket: unix:///run/containerd/containerd.sock
      ---
      # The cgroup driver belongs in the kubelet configuration rather than in kubelet flags
      apiVersion: kubelet.config.k8s.io/v1beta1
      kind: KubeletConfiguration
      cgroupDriver: systemd

- name: kubeadm init (idempotent)
  command: kubeadm init --config=/etc/kubernetes/kubeadm/config.yaml
  args: { creates: /etc/kubernetes/admin.conf }

- name: Setup kubeconfig for root
  command: bash -lc 'mkdir -p ~/.kube && cp -f /etc/kubernetes/admin.conf ~/.kube/config && chown $(id -u):$(id -g) ~/.kube/config'

- name: Create artifacts dir
  file: { path: /root/ansible-k8s-artifacts, state: directory }

- name: Generate join scripts
  shell: |
    set -euo pipefail
    kubeadm token create --print-join-command > /root/ansible-k8s-artifacts/join-worker.sh
    # Upload the control-plane certificates; the last output line is the decryption key.
    # The uploaded certificates expire after about two hours, so join extra masters promptly.
    CK=$(kubeadm init phase upload-certs --upload-certs | tail -n 1)
    kubeadm token create --ttl 24h0m0s --print-join-command --certificate-key "$CK" \
      > /root/ansible-k8s-artifacts/join-controlplane.sh
  args: { executable: /bin/bash }

- name: Fetch join scripts
  fetch:
    src: "/root/ansible-k8s-artifacts/{{ item }}"
    dest: "{{ playbook_dir }}/artifacts/{{ item }}"
    flat: yes
  loop: [ 'join-worker.sh', 'join-controlplane.sh' ]
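A worker join needs an endpoint, a token, and the CA certificate hash. A control-plane join additionally needs --control-plane and, unless the certificates are copied by hand, a --certificate-key. Since the fetched scripts are reused by later plays, a sanity check on their contents can catch a broken script before it reaches a node. The function below is a sketch with a hypothetical name. It only looks for the expected flags and does not validate the token itself.

```shell
#!/usr/bin/env bash
# Validate a fetched join script.
# Usage: check_join_script artifacts/join-worker.sh worker
#        check_join_script artifacts/join-controlplane.sh controlplane
check_join_script() {
  grep -q '^kubeadm join ' "$1" || return 1
  grep -q -- '--token ' "$1" || return 1
  grep -q -- '--discovery-token-ca-cert-hash sha256:' "$1" || return 1
  case "$2" in
    controlplane)
      # Both flags are required for an automated control-plane join.
      grep -q -- '--control-plane' "$1" && grep -q -- '--certificate-key ' "$1"
      ;;
    *)
      # A worker script must not promote the node to a control plane.
      ! grep -q -- '--control-plane' "$1"
      ;;
  esac
}
```

Bootstrap tokens expire (24 hours in this setup), so a script that passes this check can still be rejected by the API server if it is old; regenerating it by rerunning the playbook is the simplest remedy.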

8.6 node-join (join nodes)

# roles/node-join/tasks/main.yml
# The script contains a bootstrap token, so keep it readable by root only
- name: Distribute join script
  copy:
    src: "{{ playbook_dir }}/artifacts/join-{{ 'controlplane' if join_type == 'controlplane' else 'worker' }}.sh"
    dest: /tmp/join.sh
    mode: '0700'

# The script has no shebang, so run it through bash; "creates" skips nodes that already joined
- name: Join cluster if not already joined
  command: bash /tmp/join.sh
  args: { creates: /etc/kubernetes/kubelet.conf }
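Right after joining, nodes normally report NotReady until a CNI plugin is installed, which in site.yml happens in the last play. The sketch below filters `kubectl get nodes --no-headers` output for nodes that are not Ready and wraps `kubectl wait` for use once Calico is applied. The function names are hypothetical, and the 300s default timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Read `kubectl get nodes --no-headers` output on stdin and print nodes that are not Ready.
# "Ready,SchedulingDisabled" (a cordoned node) still counts as Ready.
not_ready_nodes() {
  awk '$2 !~ /^Ready(,|$)/ { print $1 }'
}

# Block until every node reports Ready, or fail after the timeout.
wait_for_nodes() {
  kubectl wait --for=condition=Ready nodes --all --timeout="${1:-300s}"
}
```

Usage on the first master: `kubectl get nodes --no-headers | not_ready_nodes` lists the stragglers, and `wait_for_nodes 600s` can gate later steps.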

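8.7 cni-calico (install networking)

# roles/cni-calico/tasks/main.yml
- name: Download Calico manifest
  get_url:
    url: "https://raw.githubusercontent.com/projectcalico/calico/{{ calico_version }}/manifests/calico.yaml"
    dest: /root/calico.yaml

# In the stock manifest the CALICO_IPV4POOL_CIDR entry is typically commented out, in which case
# Calico picks up the pod CIDR from the kubeadm configuration and this substitution only
# matters if the entry has been uncommented.
- name: Replace default Pod CIDR when needed
  replace:
    path: /root/calico.yaml
    regexp: '192\.168\.0\.0/16'
    replace: '{{ pod_cidr }}'

- name: Apply Calico
  command: kubectl apply -f /root/calico.yaml
  environment: { KUBECONFIG: /etc/kubernetes/admin.conf }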

8.8 addons (common add-ons)

# roles/addons/tasks/main.yml
- name: Metrics Server
  command: kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
  environment: { KUBECONFIG: /etc/kubernetes/admin.conf }
  ignore_errors: yes

- name: kubectl completion
  lineinfile:
    path: /root/.bashrc
    line: 'source <(kubectl completion bash)'
    create: yes

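9. Operations and Scaling

Inspect the cluster: kubectl get nodes -o wide

Inspect core components: kubectl -n kube-system get po

Renew control-plane certificates (on a master): kubeadm certs renew all, then restart the control-plane static Pods so they pick up the new certificates

Scaling out: add the new IP to the matching group in inventory.ini and rerun ansible-playbook

Changing the Pod/Service CIDR: the network components have to be torn down and rebuilt first, so settle the ranges once during planning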
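Scaling in is not covered by the playbook: removing a host from inventory.ini leaves the node registered in the cluster. A common manual sequence is to drain the node, delete it from the API, and then reset it. The sketch below wraps the first two steps; the function names and the DRY_RUN switch are assumptions introduced here, while the kubectl flags are the standard ones.

```shell
#!/usr/bin/env bash
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi
}

# Evict workloads from a node and remove it from the cluster.
# Usage: remove_node worker02
remove_node() {
  run kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data &&
    run kubectl delete node "$1"
}
```

After `remove_node worker02` succeeds, run `kubeadm reset -f` on the removed machine and delete its line from inventory.ini so a later playbook run does not join it again.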

10. Offline / Internal Mirrors

Point image_repo and kubernetes.repo.j2 at your private registry and RPM repository

List the images to pre-pull: kubeadm config images list --kubernetes-version v1.33.3

Registry mirrors can be added in the containerd role
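To populate a private registry, each image from the list has to be retagged under the new repository. The sketch below only computes the source/target pairs; pulling, tagging, and pushing are left to whichever tool you use. The registry harbor.example.com/k8s and the function names are hypothetical. Keeping only the last path component assumes that kubeadm expects CoreDNS at <repo>/coredns when a custom imageRepository is set, which is worth confirming with `kubeadm config images list --image-repository <repo>`.

```shell
#!/usr/bin/env bash
# Map an upstream image to its name in a private repository,
# keeping only the last path component (name:tag).
mirror_name() {
  echo "$2/${1##*/}"
}

# Read image names on stdin and print "source target" pairs.
# Usage: kubeadm config images list --kubernetes-version v1.33.3 | mirror_plan harbor.example.com/k8s
mirror_plan() {
  local img
  while IFS= read -r img; do
    if [ -n "$img" ]; then
      echo "$img $(mirror_name "$img" "$1")"
    fi
  done
}
```

Each output line can then feed a pull, tag, and push loop on a machine that reaches both registries.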

11. Cleanup / Teardown (Use with Caution)

kubeadm reset -f
systemctl disable --now kubelet containerd
rm -rf /etc/kubernetes /var/lib/etcd /var/lib/kubelet /etc/cni /opt/cni /var/lib/containerd


"A person goes through a great deal of pain in a lifetime, but looking back, it all turns into legend."
