Analysis of a Ceph Data Write-to-Disk Anomaly
Application architecture
External users -- request --> in-house S3 program -- request --> Ceph MON -- data write --> Ceph RADOS
Fault overview
Fault source:
In-house S3 program
Core error:
Ceph RADOS write fails with: Failed to get aio return value; ETIMEDOUT: Connection timed out (-110)
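Fault analysis process
Initial analysis
- This error usually means either a timeout connecting to a MON or a timeout exchanging data with an OSD
- atop and iostat showed that the disks backing the OSDs were busy, with high util and await values
- Every server in the cluster showed the same busy pattern
- Load at the time: 1000-2000 OPS and 200-400 MB/s of writes (in theory the cluster should run comfortably at this level)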
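To make the "busy disk" check above repeatable across servers, the output of iostat -x can be filtered by script. The following is a minimal sketch, not the tooling used in this investigation: the function name busy_devices and both thresholds (90 for %util, 50 ms for await) are assumptions chosen for illustration. It reads column names from the "Device" header line, so it does not depend on a fixed column order.

```python
def busy_devices(iostat_text, util_limit=90.0, await_limit=50.0):
    """Return {device: {column: value}} for devices over either limit.

    util_limit and await_limit are illustrative assumptions, not
    values taken from the incident.
    """
    busy = {}
    header = None
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            header = None  # a blank line ends the current device table
            continue
        if fields[0].rstrip(":") == "Device":
            header = fields[1:]  # column names for the rows that follow
            continue
        if header is None or len(fields) != len(header) + 1:
            continue  # not a device row (for example the avg-cpu block)
        try:
            values = dict(zip(header, map(float, fields[1:])))
        except ValueError:
            continue  # skip rows with non-numeric cells
        # Keep %util plus every *await column (await, r_await, w_await).
        watched = {
            name: value
            for name, value in values.items()
            if name == "%util" or name.endswith("await")
        }
        over_util = values.get("%util", 0.0) >= util_limit
        over_await = any(
            value >= await_limit
            for name, value in watched.items()
            if name != "%util"
        )
        if over_util or over_await:
            busy[fields[0]] = watched
    return busy
```

The function can be fed the captured text of one iostat -x report per server. Devices that stay in the result across repeated samples are the ones worth correlating with OSD IDs.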
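Basic hardware check
Command used to test each disk's write-to-disk performance (with $i set to the directory on the disk under test):
sync; dd if=/dev/zero of=$i/1.img bs=1G count=1 oflag=direct conv=fdatasync
Result: each disk writes at about 200-300 MB/s, which is within the normal range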
In-depth Ceph cluster checks
OSD operation status check
- dump_blocked_ops: normal, no blocked operations
- dump_ops_in_flight: normal, no abnormal operations
- dump_historic_ops: key anomalies found
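Each entry in the dump_historic_ops output carries a list of timestamped events, so the time spent between consecutive events shows where an operation stalled. The sketch below is one way to compute those gaps, assuming the event list has already been extracted from the entry as a list of {"time", "event"} objects. The helper names event_gaps and slowest_step are hypothetical, and the sample data is the first five events of the first operation recorded in this document.

```python
from datetime import datetime

# Timestamp layout used by the events in this document's dump output.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def event_gaps(events):
    """Return (previous_event, next_event, seconds) for consecutive events."""
    stamps = [
        (entry["event"], datetime.strptime(entry["time"], TIME_FORMAT))
        for entry in events
    ]
    return [
        (prev_name, name, (when - prev_when).total_seconds())
        for (prev_name, prev_when), (name, when) in zip(stamps, stamps[1:])
    ]


def slowest_step(events):
    """Return the consecutive event pair with the largest time gap."""
    return max(event_gaps(events), key=lambda gap: gap[2])


# First five events of the first operation shown in this document.
EVENTS = [
    {"time": "2025-10-11 14:47:24.068942", "event": "initiated"},
    {"time": "2025-10-11 14:47:24.069499", "event": "queued_for_pg"},
    {"time": "2025-10-11 14:47:24.069532", "event": "reached_pg"},
    {"time": "2025-10-11 14:47:26.495084", "event": "started"},
    {"time": "2025-10-11 14:47:26.495180",
     "event": "waiting for subops from 659,670"},
]

if __name__ == "__main__":
    for prev_name, name, seconds in event_gaps(EVENTS):
        print(f"{prev_name} -> {name}: {seconds:.6f}s")
```

For these five events the largest gap is between reached_pg and started, about 2.43 seconds. The other steps each take less than a millisecond.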
Only three operations are shown and explained here as examples
{"description": "osd_op(client.283785238.0:309800226 6.923d09cd smpro-file-calced\/v2\/AZxKZdPlW2OpI4rQ\/ec40e7cb1a77a49479c79d45fe91f66809d63ba3\/QKr33\/0-202.md5_61f88f7f.a6a0be [] snapc 0=[] ondisk+write+known_if_redirected e258985)","initiated_at": "2025-10-11 14:47:24.068942","age": 562.528570,"duration": 11.586828,"type_data": ["commit sent; apply or cleanup",{"client": "client.283785238","tid": 309800226},[{"time": "2025-10-11 14:47:24.068942","event": "initiated"},{"time": "2025-10-11 14:47:24.069499","event": "queued_for_pg"},{"time": "2025-10-11 14:47:24.069532","event": "reached_pg"},{"time": "2025-10-11 14:47:26.495084","event": "started"},{"time": "2025-10-11 14:47:26.495180","event": "waiting for subops from 659,670"},{"time": "2025-10-11 14:47:26.495460","event": "commit_queued_for_journal_write"},{"time": "2025-10-11 14:47:26.495483","event": "write_thread_in_journal_buffer"},{"time": "2025-10-11 14:47:26.496388","event": "journaled_completion_queued"},{"time": "2025-10-11 14:47:26.496412",