kuboard自带ETCD存储满了处理方案
一、前言
当运行 ETCD 日志报 Erro: mvcc database space exceeded 时,说明 ETCD 存储不足了(默认 ETCD 存储是 2G),配额会触发告警,然后 Etcd 系统将进入操作受限的维护模式。
通过下面命令可以查看 ETCD 存储使用情况:
$ ETCDCTL_API=3 etcdctl --endpoints="http://127.0.0.1:2379" --write-out=table endpoint status
[root@Jenkins ~]# cd kuboard-data/etcd-data/member/snap/
[root@Jenkins snap]# ll -h
总用量 1.9G
-rw-r--r-- 1 root root 9.4K 10月 11 17:00 0000000000000009-0000000000557338.snap
-rw-r--r-- 1 root root 9.4K 10月 20 10:44 0000000000000009-000000000056f9d9.snap
-rw-r--r-- 1 root root 9.4K 10月 29 04:29 0000000000000009-000000000058807a.snap
-rw-r--r-- 1 root root 9.4K 11月 6 22:21 0000000000000009-00000000005a071b.snap
-rw-r--r-- 1 root root 9.4K 11月 15 16:07 0000000000000009-00000000005b8dbc.snap
-rw------- 1 root root 2G 11月 23 11:39 db
二、临时解决方案
❝PS: 压缩前做好快照备份,命令 etcdctl snapshot save backup.db ❞
通过 ETCD 数据压缩来临时解决问题,具体如下操作
# 获取当前版本
$ rev=$(ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 endpoint status --write-out="json" | egrep -o '"revision":[0-9]*' | egrep -o '[0-9].*')
# 压缩所有旧版本
$ ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 compact $rev
# 整理多余的空间
$ ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 defrag
# 取消告警信息
$ ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 alarm disarm
# 测试是否能成功写入
$ ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 put testkey 123
OK
# 再次查看ETCD存储使用情况
$ ETCDCTL_API=3 etcdctl --endpoints="http://127.0.0.1:2379" --write-out=table endpoint status
详细过程及其输出如下:
root@a2caf8010e75:/# etcdctl snapshot save backup.db
{"level":"info","ts":1695447648.315712,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"backup.db.part"}
{"level":"info","ts":"2023-09-23T13:40:48.317+0800","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1695447648.3172774,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"127.0.0.1:2379"}
{"level":"info","ts":"2023-09-23T13:41:03.646+0800","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1695447663.8131642,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"127.0.0.1:2379","size":"2.1 GB","took":15.497392681}
{"level":"info","ts":1695447663.8132935,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"backup.db"}
Snapshot saved at backup.db
root@a2caf8010e75:/# rev=$(ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 endpoint status --write-out="json" | egrep -o '"revision":[0-9]*' | egrep -o '[0-9].*')
root@a2caf8010e75:/# ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 compact $rev
compacted revision 6077603
root@a2caf8010e75:/# ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 defrag
Finished defragmenting etcd member[http://127.0.0.1:2379]
root@a2caf8010e75:/# ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 alarm disarm
memberID:6460912315094810421 alarm:NOSPACE
root@a2caf8010e75:/# ETCDCTL_API=3 etcdctl --endpoints="http://127.0.0.1:2379" --write-out=table endpoint status
+-----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://127.0.0.1:2379 | 59a9c584ea2c3f35 | 3.4.14 | 127 kB | true | false | 6 | 6089454 | 6089454 | |
+-----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
root@a2caf8010e75:/#
————————————————
三、最终解决方案
在 ETCD 启动命令中添加下面两个参数:
# 表示每隔一个小时自动压缩一次
--auto-compaction-retention=1
# 磁盘空间调整为 8G,官方建议最大 8G(单位是字节)
--quota-backend-bytes=8388608000
四、最佳实践
大家有没有使用过 Kuboard(Kubernetes 多集群管理界面,官网地址:https://kuboard.cn),如果有使用过的同学可能会遇到 ETCD 存储不足的问题,因为官网提供的 docker 镜像中,ETCD 启动参数并没有添加 --auto-compaction-retention 和 --quota-backend-bytes 参数。
修改官网 Kuboard docker 镜像 /entrypoint.sh 启动脚本
生成 Dockerfile 文件:
# 编辑 Dockerfile
$ vim Dockerfile
FROM eipwork/kuboard:v3.5.0.3
COPY ./entrypoint.sh /entrypoint.sh
# 构建镜像
$ docker build -t eipwork/kuboard-modify:v3.5.0.3 . -f Dockerfile
启动 Kuboard,并查看进程如下:
五、参考文档
- https://etcd.io/docs/v3.4/op-guide/maintenance/