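Description
Two PGs in the Ceph cluster are stuck in the active+undersized+degraded state. PGs 1.9d0 and 1.be4 currently have only two replicas each and cannot recover to three replicas automatically. The problem has persisted ever since an OSD failed. Overall cluster utilization is 50%, and the fullest OSD is at 65%.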
pg 1.9d0 is stuck undersized for 115716.952775, current state active+undersized+degraded, last acting [113,1866]
pg 1.be4 is stuck undersized for 115716.952775, current state active+undersized+degraded, last acting [2028,1179]
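The two health lines above can be turned into structured data to see at a glance how long each PG has been stuck and how many replicas it is missing. This is a minimal sketch: the regular expression is derived only from the two sample lines quoted here, and the names `parse_stuck_pgs`, `missing_replicas` and `pool_size` are hypothetical helpers, not part of any Ceph tool.

```python
import re

# Pattern for one "stuck undersized" line, based only on the two sample
# lines quoted in this post (assumption: other releases may format it differently).
STUCK_RE = re.compile(
    r"pg (?P<pgid>\S+) is stuck undersized for (?P<secs>[\d.]+), "
    r"current state (?P<state>\S+), last acting \[(?P<acting>[\d,]*)\]"
)


def parse_stuck_pgs(text):
    """Return one dict per stuck PG found in the health output."""
    pgs = []
    for m in STUCK_RE.finditer(text):
        acting = [int(x) for x in m.group("acting").split(",") if x]
        pgs.append({
            "pgid": m.group("pgid"),
            "stuck_hours": float(m.group("secs")) / 3600,
            "state": m.group("state"),
            "acting": acting,
        })
    return pgs


def missing_replicas(pg, pool_size=3):
    """Number of replicas the PG lacks (pool_size=3 is this cluster's setting)."""
    return pool_size - len(pg["acting"])


SAMPLE = (
    "pg 1.9d0 is stuck undersized for 115716.952775, current state "
    "active+undersized+degraded, last acting [113,1866]\n"
    "pg 1.be4 is stuck undersized for 115716.952775, current state "
    "active+undersized+degraded, last acting [2028,1179]\n"
)

if __name__ == "__main__":
    for pg in parse_stuck_pgs(SAMPLE):
        print(pg["pgid"], pg["acting"], "missing", missing_replicas(pg))
```

Both PGs have been stuck for a little over 32 hours and are each missing one replica.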
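The cluster is configured for 3 replicas, and a large number of OSDs are below the cluster's average utilization.
Attempted fixes
All of the following were tried, but none of them solved the problem:
restarting the two OSDs that currently serve the PG
ceph pg repair xxx
ceph pg scrub xxx
ceph pg deep-scrub xxx
Current PG state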
{
"state": "active+undersized+degraded",
"snap_trimq": "[]",
"snap_trimq_len": 0,
"epoch": 223596,
"up": [   <- expected OSDs
113,
1866],
"acting": [   <- current OSDs
113,
1866],
"actingbackfill": [   <- no backfill is taking place
"113",
"1866"],
.....
"blocked_by": [],   <--- not blocked
"up_primary": 113,
"acting_primary": 113},
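The PG is degraded and nothing is blocking PG operations, yet CRUSH does not compute a third OSD on its own, so the PG never starts the recovery that would bring it back to three replicas.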
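The reasoning above can be written down as a small check over the pg info fields. This is only a sketch: it assumes the fields are available as one flat dict, as in the excerpt shown in this post (real `ceph pg query` output nests some of them), and `diagnose_pg` is a hypothetical helper name.

```python
import json


def diagnose_pg(pg_info, pool_size=3):
    """Summarise why a PG might be stuck, from a few pg info fields."""
    up = pg_info.get("up", [])
    acting = pg_info.get("acting", [])
    blocked_by = pg_info.get("blocked_by", [])
    return {
        # Fewer OSDs are serving the PG than the pool asks for.
        "undersized": len(acting) < pool_size,
        # CRUSH itself returned too few OSDs, so there is nothing to backfill to.
        "crush_short": len(up) < pool_size,
        # up and acting differ only while data is being moved.
        "backfill_pending": set(up) != set(acting),
        # Some other OSD is holding the PG back.
        "blocked": bool(blocked_by),
    }


# Values copied from the pg info excerpt for PG 1.9d0.
sample = json.loads("""
{
  "state": "active+undersized+degraded",
  "epoch": 223596,
  "up": [113, 1866],
  "acting": [113, 1866],
  "actingbackfill": ["113", "1866"],
  "blocked_by": [],
  "up_primary": 113,
  "acting_primary": 113
}
""")

if __name__ == "__main__":
    print(diagnose_pg(sample))
```

For this PG the check reports an undersized acting set, a short up set, no pending backfill and no blocker, which points at placement (CRUSH) rather than at a stuck recovery.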
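Root cause
From what I could find, when disk sizes in a cluster are unbalanced, CRUSH can fail to select a third replica. CRUSH only retries a placement a bounded number of times, so heavily skewed weights presumably make it more likely to give up before it finds a valid third OSD. The current CRUSH map looks like this: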
3180.00000 rack 3F-302-A-05
3932.99927 rack 3F-302-B-04
3496.19946 rack 3F-302-B-26
3346.00000 rack 3F-302-C-07
3408.00000 rack 3F-302-D-13
2402.39795 rack 6F-602-C-23
3499.79932 rack 6F-602-G-01
873.69989 rack 6F-602-L-14
952.99988 rack 7F-02-C07
700.00000 rack 7F-02-E11
702.00000 rack 7F-02-E19
700.00000 rack 7F-02-F03
702.00000 rack 7F-02-F20
880.00000 rack 7F-702-L-18
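Rack capacities clearly differ by a factor of roughly 3 to more than 5 (for example 3932.99927 versus 700.00000). Later on, this imbalance may also lead to problems such as blocked I/O and uneven PG distribution.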
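The spread between racks can be checked directly from the weights listed above. The sketch below only does arithmetic on those 14 lines; `parse_rack_weights` is a hypothetical helper, and the 1000 threshold for a "small" rack is an arbitrary cut chosen for illustration.

```python
# Rack weights copied from the CRUSH map listing in this post.
CRUSH_RACKS = """\
3180.00000 rack 3F-302-A-05
3932.99927 rack 3F-302-B-04
3496.19946 rack 3F-302-B-26
3346.00000 rack 3F-302-C-07
3408.00000 rack 3F-302-D-13
2402.39795 rack 6F-602-C-23
3499.79932 rack 6F-602-G-01
873.69989 rack 6F-602-L-14
952.99988 rack 7F-02-C07
700.00000 rack 7F-02-E11
702.00000 rack 7F-02-E19
700.00000 rack 7F-02-F03
702.00000 rack 7F-02-F20
880.00000 rack 7F-702-L-18
"""


def parse_rack_weights(text):
    """Map rack name -> CRUSH weight from 'WEIGHT rack NAME' lines."""
    weights = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[1] == "rack":
            weights[parts[2]] = float(parts[0])
    return weights


weights = parse_rack_weights(CRUSH_RACKS)
heaviest = max(weights, key=weights.get)
lightest = min(weights, key=weights.get)
ratio = weights[heaviest] / weights[lightest]
# Arbitrary illustration threshold: racks with weight below 1000.
small_racks = sorted(name for name, w in weights.items() if w < 1000)

if __name__ == "__main__":
    print(f"heaviest {heaviest}: {weights[heaviest]}")
    print(f"lightest {lightest}: {weights[lightest]}")
    print(f"ratio: {ratio:.2f}")
    print(f"{len(small_racks)} of {len(weights)} racks are below 1000")
```

With these numbers the heaviest rack (3F-302-B-04) carries about 5.6 times the weight of the lightest ones, and half of the racks are in the small group.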
Recovery method
3180.00000 rack 3F-302-A-05 <--- source OSD: osd.1179
3932.99927 rack 3F-302-B-04
3496.19946 rack 3F-302-B-26
3346.00000 rack 3F-302-C-07
3408.00000 rack 3F-302-D-13
2402.39795 rack 6F-602-C-23
3499.79932 rack 6F-602-G-01
873.69989 rack 6F-602-L-14
952.99988 rack 7F-02-C07
700.00000 rack 7F-02-E11
702.00000 rack 7F-02-E19
700.00000 rack 7F-02-F03
702.00000 rack 7F-02-F20
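880.00000 rack 7F-702-L-18 <- target OSD: osd.723
Manually remap PG 1.be4 from osd.1179 (rack 3F-302-A-05) to osd.723 (rack 7F-702-L-18) with pg-upmap-items:
ceph osd pg-upmap-items 1.be4 osd.1179 osd.723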
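When several PGs need the same treatment, the command line can be assembled and sanity-checked before it is run. The sketch below only builds the argument list and never calls ceph; `build_upmap_command` and its `current_osds` parameter are hypothetical, and the two checks (source must currently hold the PG, target must not) are my own assumptions about what is worth validating. Note that upmap entries generally require all clients to be Luminous or newer.

```python
def build_upmap_command(pgid, pairs, current_osds):
    """Build the argv for `ceph osd pg-upmap-items` from (from_osd, to_osd) pairs.

    current_osds is the PG's present OSD set, used only for sanity checks.
    """
    argv = ["ceph", "osd", "pg-upmap-items", pgid]
    for src, dst in pairs:
        if src not in current_osds:
            raise ValueError(f"osd.{src} does not currently hold pg {pgid}")
        if dst in current_osds:
            raise ValueError(f"osd.{dst} already holds pg {pgid}")
        argv += [f"osd.{src}", f"osd.{dst}"]
    return argv


# PG 1.be4 from this post: last acting [2028,1179], remap 1179 -> 723.
cmd = build_upmap_command("1.be4", [(1179, 723)], current_osds=[2028, 1179])

if __name__ == "__main__":
    print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run` once reviewed. If the mapping has to be undone, `ceph osd rm-pg-upmap-items <pgid>` removes the entry.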
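Result
After the remap, the PG can recover to 3 replicas. For reference, this is the pg info after running the command:
"up": [   <- after the remap the PG is back to 3 replicas
1046,
1990,
723],
"acting": [   <- current PG state
1179,
2028],
"backfill_targets": [   <- backfill targets
"723",
"1046",
"1990"],
"actingbackfill": [   <- OSDs involved in the PG migration
"723",
"1046",
"1179",
"1990",
"2028"],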