
Oracle RAC Hits the MTU Pitfall Again: cssd Fails to Start!

Preface

I recently deployed an Oracle 19c RAC environment. After installation, the servers were shut down, packed, and shipped to a remote data center. Once racked, one node would not start properly: the cluster hung at the startup of the cssd resource. The interconnect (heartbeat) network uses fiber links through a dedicated VLAN on the switches, and the nodes could ping each other normally, yet the RAC could only run on a single node. After in-depth analysis, the fault was finally traced to an MTU configuration problem.

This article documents the detailed analysis process and the solution.

Symptoms

One of the Oracle RAC nodes failed to start; the cluster status showed:

[root@orcl02:/root]# crsctl stat res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               STABLE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE                               STABLE
ora.crf
      1        ONLINE  ONLINE       orcl02                   STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  OFFLINE      orcl02                   STARTING
ora.cssdmonitor
      1        ONLINE  ONLINE       orcl02                   STABLE
ora.ctssd
      1        ONLINE  OFFLINE                               STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       orcl02                   STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE orcl02                   STABLE
ora.gipcd
      1        ONLINE  ONLINE       orcl02                   STABLE
ora.gpnpd
      1        ONLINE  ONLINE       orcl02                   STABLE
ora.mdnsd
      1        ONLINE  ONLINE       orcl02                   STABLE
ora.storage
      1        ONLINE  OFFLINE                               STABLE

The cluster hangs at the cssd resource startup stage.

Check the CRS alert log ($ORACLE_BASE/diag/crs/orcl02/crs/trace/alert.log):

2025-11-11 12:21:57.020 [GIPCD(127382)]CRS-7517: The Oracle Grid Interprocess Communication (GIPC) failed to identify the Fast Node Death Detection (FNDD).
2025-11-11 12:23:07.241 [OCSSD(130761)]CRS-1621: The IPMI configuration data for this node stored in the Oracle registry is incomplete; details at (:CSSNK00002:) in /u01/app/grid/diag/crs/orcl02/crs/trace/ocssd.trc
2025-11-11 12:23:07.241 [OCSSD(130761)]CRS-1617: The information required to do node kill for node orcl02 is incomplete; details at (:CSSNM00004:) in /u01/app/grid/diag/crs/orcl02/crs/trace/ocssd.trc
2025-11-11 12:23:37.324 [OCSSD(130761)]CRS-7500: The Oracle Grid Infrastructure process 'ocssd' failed to establish Oracle Grid Interprocess Communication (GIPC) high availability connection with remote node 'orcl01'.
2025-11-11 12:28:39.020 [OCSSD(130761)]CRS-7500: The Oracle Grid Infrastructure process 'ocssd' failed to establish Oracle Grid Interprocess Communication (GIPC) high availability connection with remote node 'orcl01'.
2025-11-11 12:32:58.470 [OCSSD(130761)]CRS-1609: This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00086:) in /u01/app/grid/diag/crs/orcl02/crs/trace/ocssd.trc.
2025-11-11 12:32:58.469 [CSSDAGENT(130651)]CRS-5818: Aborted command 'start' for resource 'ora.cssd'. Details at (:CRSAGF00113:) {0:5:4} in /u01/app/grid/diag/crs/orcl02/crs/trace/ohasd_cssdagent_root.trc.
2025-11-11 12:32:58.506 [OHASD(126708)]CRS-2757: Command 'Start' timed out waiting for response from the resource 'ora.cssd'. Details at (:CRSPE00221:) {0:5:4} in /u01/app/grid/diag/crs/orcl02/crs/trace/ohasd.trc.
2025-11-11 12:32:59.470 [OCSSD(130761)]CRS-1656: The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/grid/diag/crs/orcl02/crs/trace/ocssd.trc
2025-11-11 12:32:59.470 [OCSSD(130761)]CRS-1603: CSSD on node orcl02 has been shut down.
2025-11-11 12:33:00.151 [OCSSD(130761)]CRS-1609: This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00086:) in /u01/app/grid/diag/crs/orcl02/crs/trace/ocssd.trc.
2025-11-11T12:33:04.480316+08:00
Errors in file /u01/app/grid/diag/crs/orcl02/crs/trace/ocssd.trc  (incident=17):
CRS-8503 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: /u01/app/grid/diag/crs/orcl02/crs/incident/incdir_17/ocssd_i17.trc
2025-11-11 12:33:04.471 [OCSSD(130761)]CRS-8503: Oracle Clusterware process OCSSD with operating system process ID 130761 experienced fatal signal or exception code 6.

The log clearly shows that node orcl02 cannot establish a GIPC high availability connection with node orcl01.

Further analysis of the cssd trace ($ORACLE_BASE/diag/crs/orcl02/crs/trace/ocssd.trc):

2025-11-11 13:07:10.563 :    CSSD:909854464: [     INFO] clssnmvDHBValidateNCopy: node 1, orcl01, has a disk HB, but no network HB, DHB has rcfg 658329638, wrtcnt, 36406, LATS 5334194, lastSeqNo 36403, uniqueness 1762834827, timestamp 1762837626/5323124
2025-11-11 13:07:10.564 :    CSSD:897136384: [     INFO] clssscSelect: gipcwait returned with status gipcretTimeout (16)

The trace shows that node orcl01 has a disk heartbeat but no network heartbeat, indicating a communication problem on the interconnect network.

Problem Analysis

Initial Checks

First, check basic network connectivity:

[root@orcl02:/tmp/mcasttest]# ping orcl01-priv
PING orcl01-priv (1.1.1.1) 56(84) bytes of data.
64 bytes from orcl01-priv (1.1.1.1): icmp_seq=1 ttl=64 time=0.053 ms
64 bytes from orcl01-priv (1.1.1.1): icmp_seq=2 ttl=64 time=0.044 ms
64 bytes from orcl01-priv (1.1.1.1): icmp_seq=3 ttl=64 time=0.116 ms

The heartbeat IPs ping normally, and the firewall is disabled.
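For completeness, the firewall state can be double-checked as below (a quick sanity check, assuming an EL8-style system managed by firewalld):

# Confirm the firewall is not filtering interconnect traffic
systemctl is-active firewalld    # expected output: inactive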

MOS Reference

A search on MOS turned up a similar case: OCI DBCS : Failed to start CRS on first RAC node - (GIPC) failed to identify the Fast Node Death Detection (FNDD). (Doc ID 2969313.1)

That note identifies inconsistent MTU configuration between nodes as the root cause.

MTU Configuration Check

Check the MTU configuration on both nodes:

## Node 1
[root@orcl01:/home/grid]# ifconfig bond1 | grep mtu
bond1: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 9000

## Node 2
[root@orcl02:/home/grid]# ifconfig bond1 | grep mtu
bond1: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 9000

The MTU configuration is consistent across the nodes: 9000 on both.
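The same check can be scripted across nodes with the iproute2 tooling (a minimal sketch; it assumes SSH user equivalence between the nodes, which a RAC installation already requires):

# Query the bond1 MTU on both nodes in one pass
for h in orcl01 orcl02; do
    echo -n "$h: "
    ssh "$h" ip link show bond1 | grep -o 'mtu [0-9]*'
done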

CVU Check

Use the Oracle Cluster Verification Utility (CVU) to verify network connectivity between the RAC cluster nodes:

[grid@orcl01:/home/grid]$ cluvfy comp nodecon -n all -verbose

Performing following verification checks ...

Node Connectivity ...
  Hosts File ...
  Node Name                             Status
  ------------------------------------  ------------------------
  orcl01                                passed
  orcl02                                passed
  Hosts File ...PASSED

  Interface information for node "orcl02"
  Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU
  ------ --------------- --------------- --------------- --------------- ----------------- ------
  bond0  192.168.6.206   192.168.6.0     0.0.0.0         192.168.6.1     8C:84:74:75:EF:00 1500
  bond1  1.1.1.2         1.1.1.0         0.0.0.0         192.168.6.1     8C:84:74:DA:B8:E0 9000

  Interface information for node "orcl01"
  Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU
  ------ --------------- --------------- --------------- --------------- ----------------- ------
  bond0  192.168.6.205   192.168.6.0     0.0.0.0         192.168.6.1     8C:84:74:75:F2:00 1500
  bond0  192.168.6.207   192.168.6.0     0.0.0.0         192.168.6.1     8C:84:74:75:F2:00 1500
  bond0  192.168.6.209   192.168.6.0     0.0.0.0         192.168.6.1     8C:84:74:75:F2:00 1500
  bond0  192.168.6.208   192.168.6.0     0.0.0.0         192.168.6.1     8C:84:74:75:F2:00 1500
  bond1  1.1.1.1         1.1.1.0         0.0.0.0         192.168.6.1     8C:84:74:DA:D3:10 9000

  Check: MTU consistency on the private interfaces of subnet "1.1.1.0"
  Node              Name          IP Address    Subnet        MTU
  ----------------  ------------  ------------  ------------  ----------------
  orcl02            bond1         1.1.1.2       1.1.1.0       9000
  orcl01            bond1         1.1.1.1       1.1.1.0       9000

  Check: MTU consistency of the subnet "192.168.6.0".
  Node              Name          IP Address     Subnet        MTU
  ----------------  ------------  -------------  ------------  ----------------
  orcl02            bond0         192.168.6.206  192.168.6.0   1500
  orcl01            bond0         192.168.6.205  192.168.6.0   1500
  orcl01            bond0         192.168.6.207  192.168.6.0   1500
  orcl01            bond0         192.168.6.209  192.168.6.0   1500
  orcl01            bond0         192.168.6.208  192.168.6.0   1500

  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  orcl01[bond0:192.168.6.205]     orcl02[bond0:192.168.6.206]     yes
  orcl01[bond0:192.168.6.205]     orcl01[bond0:192.168.6.207]     yes
  orcl01[bond0:192.168.6.205]     orcl01[bond0:192.168.6.209]     yes
  orcl01[bond0:192.168.6.205]     orcl01[bond0:192.168.6.208]     yes
  orcl02[bond0:192.168.6.206]     orcl01[bond0:192.168.6.207]     yes
  orcl02[bond0:192.168.6.206]     orcl01[bond0:192.168.6.209]     yes
  orcl02[bond0:192.168.6.206]     orcl01[bond0:192.168.6.208]     yes
  orcl01[bond0:192.168.6.207]     orcl01[bond0:192.168.6.209]     yes
  orcl01[bond0:192.168.6.207]     orcl01[bond0:192.168.6.208]     yes
  orcl01[bond0:192.168.6.209]     orcl01[bond0:192.168.6.208]     yes

  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  orcl01[bond1:1.1.1.1]           orcl02[bond1:1.1.1.2]           yes

  Check that maximum (MTU) size packet goes through subnet ...FAILED (PRVG-12885, PRVG-12884, PRVG-2043)
  subnet mask consistency for subnet "192.168.6.0" ...PASSED
  subnet mask consistency for subnet "1.1.1.0" ...PASSED
Node Connectivity ...FAILED (PRVG-12885, PRVG-12884, PRVG-2043)

Multicast or broadcast check ...
  Checking subnet "1.1.1.0" for multicast communication with multicast group "224.0.0.251"
Multicast or broadcast check ...PASSED

Verification of node connectivity was unsuccessful on all the specified nodes.

Failures were encountered during execution of CVU verification request "node connectivity".

Node Connectivity ...FAILED
  Check that maximum (MTU) size packet goes through subnet ...FAILED
  PRVG-12885 : ICMP packet of MTU size "9000" does not go through subnet "1.1.1.0".
  PRVG-12884 : Maximum (MTU) size packet check failed on subnets "1.1.1.0"
  orcl01: PRVG-2043 : Command "/bin/ping 1.1.1.2 -c 1 -w 3 -M do -s 8972" failed on node "orcl01" and produced the following output:
  PING 1.1.1.2 (1.1.1.2) 8972(9000) bytes of data.

  --- 1.1.1.2 ping statistics ---
  3 packets transmitted, 0 received, 100% packet loss, time 2074ms

CVU operation performed:      node connectivity
Date:                         Nov 11, 2025 1:10:21 PM
CVU version:                  19.28.0.0.0 (070125x8664)
Clusterware version:          19.0.0.0.0
CVU home:                     /u01/app/19.3.0/grid
Grid home:                    /u01/app/19.3.0/grid
User:                         grid
Operating system:             Linux4.18.0-553.el8_10.x86_64

CVU found that packets of MTU size 9000 do not go through the subnet:

Node Connectivity ...FAILED
Check that maximum (MTU) size packet goes through subnet ...FAILED
PRVG-12885 : ICMP packet of MTU size "9000" does not go through subnet "1.1.1.0".
PRVG-12884 : Maximum (MTU) size packet check failed on subnets "1.1.1.0"
orcl01: PRVG-2043 : Command "/bin/ping 1.1.1.2 -c 1 -w 3 -M do -s 8972" failed on node "orcl01" and produced the following output:
PING 1.1.1.2 (1.1.1.2) 8972(9000) bytes of data.
--- 1.1.1.2 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2074ms

MTU Test Verification

Test the maximum packet size with /bin/ping 1.1.1.2 -c 1 -w 3 -M do -s 8972. The -M do option sets the Don't Fragment bit, and the 8972-byte payload plus the 20-byte IP header and the 8-byte ICMP header adds up to exactly 9000 bytes (likewise, 1472 + 28 = 1500):

# Test MTU 9000 - fails
[grid@orcl01:/home/grid]$ ping 1.1.1.2 -c 3 -M do -s 8972
PING 1.1.1.2 (1.1.1.2) 8972(9000) bytes of data.
^C
--- 1.1.1.2 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1021ms

# Test MTU 1500 - succeeds
[grid@orcl01:/home/grid]$  ping 1.1.1.2 -c 3 -M do -s 1472
PING 1.1.1.2 (1.1.1.2) 1472(1500) bytes of data.
1480 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.177 ms
1480 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.127 ms
^C
--- 1.1.1.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1032ms
rtt min/avg/max/mdev = 0.127/0.152/0.177/0.025 ms

The tests show that communication works at MTU 1500 but suffers 100% packet loss at MTU 9000. The network engineers confirmed that the switches had not been configured for Jumbo Frame support. Case closed!

Solution

Switch Configuration

The network engineers were asked to enable Jumbo Frame support on the switches, setting the switch MTU to a value larger than 9000 (vendors commonly use 9216 to leave headroom for frame headers).
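For orientation only, a hypothetical sketch of what such a change can look like on an NX-OS-style data-center switch (the command syntax, scope, and maximum value vary by vendor and model; this is not the exact configuration applied here):

! Raise the MTU on the switch ports carrying the interconnect VLAN
configure terminal
interface ethernet 1/1
  mtu 9216
end

After the switch configuration was completed, retest from the node: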

[grid@orcl01:/home/grid]$ /bin/ping 1.1.1.2 -c 1 -w 3 -M do -s 8972
PING 1.1.1.2 (1.1.1.2) 8972(9000) bytes of data.
8980 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.232 ms

--- 1.1.1.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.232/0.232/0.232/0.000 ms

The MTU 9000 communication test now succeeds.

Restart the Cluster Services

Restart the clusterware stack on both nodes:

crsctl stop crs -f
crsctl start crs
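To confirm the stack is healthy on every node first, crsctl check cluster -all reports the state of the CRS, CSS, and EVM daemons across the cluster:

crsctl check cluster -all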

Check the cluster status:

[root@orcl01:/soft]# crsctl stat res -t 
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       orcl01                   STABLE
               ONLINE  ONLINE       orcl02                   STABLE
ora.chad
               ONLINE  ONLINE       orcl01                   STABLE
               ONLINE  ONLINE       orcl02                   STABLE
ora.net1.network
               ONLINE  ONLINE       orcl01                   STABLE
               ONLINE  ONLINE       orcl02                   STABLE
ora.ons
               ONLINE  ONLINE       orcl01                   STABLE
               ONLINE  ONLINE       orcl02                   STABLE
ora.proxy_advm
               OFFLINE OFFLINE      orcl01                   STABLE
               OFFLINE OFFLINE      orcl02                   STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ARCH.dg(ora.asmgroup)
      1        ONLINE  ONLINE       orcl01                   STABLE
      2        ONLINE  ONLINE       orcl02                   STABLE
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       orcl01                   STABLE
      2        ONLINE  ONLINE       orcl02                   STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       orcl01                   STABLE
      2        ONLINE  ONLINE       orcl02                   STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       orcl02                   STABLE
ora.OCR.dg(ora.asmgroup)
      1        ONLINE  ONLINE       orcl01                   STABLE
      2        ONLINE  ONLINE       orcl02                   STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       orcl01                   Started,STABLE
      2        ONLINE  ONLINE       orcl02                   Started,STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       orcl01                   STABLE
      2        ONLINE  ONLINE       orcl02                   STABLE
ora.cvu
      1        ONLINE  ONLINE       orcl02                   STABLE
ora.orcl.db
      1        ONLINE  ONLINE       orcl01                   Open,HOME=/u01/app/oracle/product/19.3.0/db,STABLE
      2        ONLINE  ONLINE       orcl02                   Open,HOME=/u01/app/oracle/product/19.3.0/db,STABLE
ora.qosmserver
      1        ONLINE  ONLINE       orcl02                   STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       orcl02                   STABLE
ora.orcl01.vip
      1        ONLINE  ONLINE       orcl01                   STABLE
ora.orcl02.vip
      1        ONLINE  ONLINE       orcl02                   STABLE
--------------------------------------------------------------------------------

All cluster resources started normally; problem solved.

MTU Configuration Principles

The default MTU of both the heartbeat NIC and the switch is 1500. When the NIC MTU is raised to 9000 at the OS level but the switch is left at its default, frames larger than the switch's MTU are dropped, so full-size packets see 100% loss even though small packets (such as a default-size ping) still get through.
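For reference, a minimal sketch of how the NIC-side MTU is typically set on an EL8-style system (this assumes legacy ifcfg network scripts; on a NetworkManager-managed host the equivalent nmcli commands would be used instead):

# Temporary change, lost on reboot
ip link set dev bond1 mtu 9000

# Persistent change: add this line to
# /etc/sysconfig/network-scripts/ifcfg-bond1
MTU=9000

The host side is only half of the story: every switch port along the interconnect path must also support the same or a larger MTU, which is exactly what was missing in this case.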

Why does Oracle recommend an MTU of 9000 for the interconnect NIC?

See the MOS document: Recommendation for the Real Application Cluster Interconnect and Jumbo Frames (Doc ID 341788.1)

Advantages of Jumbo Frames

  1. Lower protocol overhead: reduces TCP, UDP, and Ethernet header overhead
  2. Higher throughput: avoids packet fragmentation and improves transfer efficiency (see the worked example below)
  3. Lower latency: fewer buffer transfers, shortening Oracle block transfer latency
  4. CPU savings: significantly improves performance in CPU-bound scenarios
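A rough back-of-the-envelope illustration of points 1 and 2 (the numbers are illustrative): shipping a single 8 KB (8192-byte) database block over UDP on the interconnect.

MTU 1500: 1500 - 20 (IP header) - 8 (UDP header) = 1472 bytes of block data
          in the first fragment; the full block needs 6 IP fragments
MTU 9000: 9000 - 28 = 8972 bytes of payload; the whole block fits in 1 packet

One packet instead of six means fewer headers on the wire, fewer interrupts, and no IP reassembly work on the receiving node.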

For modifying the private network MTU configuration in an Oracle cluster, see the MOS document: How to Modify Private Network Information in an Oracle Cluster Environment (Doc ID 2103317.1).
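For orientation, a minimal sketch of the oifcfg commands that the procedure revolves around (the interface name and subnet are this cluster's; the old eth1/10.0.0.0 entry is purely hypothetical, and the full MOS procedure, including when to stop the clusterware, must be followed):

# List the interfaces currently registered with the cluster
oifcfg getif

# Register bond1 on subnet 1.1.1.0 as the private interconnect
oifcfg setif -global bond1/1.1.1.0:cluster_interconnect

# Remove a stale registration (hypothetical old interface)
oifcfg delif -global eth1/10.0.0.0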

Final Thoughts

The root cause of this incident was an inconsistent end-to-end MTU configuration: although the operating systems on the nodes were configured with a 9000-byte MTU, the network devices in between kept their default 1500-byte MTU, so large packets failed to get through.
