On Oracle SGA Memory Thrashing
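Starting with Oracle 10g, Oracle introduced automatic SGA management (Automatic Shared Memory Management, or ASMM). Once the SGA_TARGET parameter is set, the two major memory components, the buffer cache and the shared pool, can adaptively obtain memory from the SGA as needed. For example, when the buffer cache runs short of memory, Oracle can "take" memory from the shared pool; when the shared pool runs short, it can likewise take memory from the buffer cache. To some extent, this automatic mechanism avoids the performance degradation caused by a single component running out of memory.
In Oracle 10g, the following memory components are sized automatically under SGA_TARGET: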
Buffer cache (DB_CACHE_SIZE)
Shared pool (SHARED_POOL_SIZE)
Large pool (LARGE_POOL_SIZE)
Java pool (JAVA_POOL_SIZE)
Streams pool (STREAMS_POOL_SIZE)
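The following memory components are not sized automatically under SGA_TARGET:
Log buffer (LOG_BUFFER)
Keep pool, recycle pool, and nK buffer pools (DB_KEEP_CACHE_SIZE, DB_RECYCLE_CACHE_SIZE, DB_nK_CACHE_SIZE)
By default, the parameters of the automatically tuned components have a value of 0, as shown below: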
SQL> select name,value from v$parameter
  2  where name in ('db_cache_size','java_pool_size','large_pool_size',
  3  'shared_pool_size','streams_pool_size');

NAME                 VALUE
-------------------- --------------------
shared_pool_size     0
large_pool_size      0
java_pool_size       0
streams_pool_size    0
db_cache_size        0
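The current sizes of the automatically tuned components are saved in the SPFILE as hidden parameters (prefixed with a double underscore), so after a restart Oracle can "remember" the sizes it was last using, as shown below: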
ora10205.__db_cache_size=394264576
ora10205.__java_pool_size=4194304
ora10205.__large_pool_size=4194304
ora10205.__shared_pool_size=104857600
ora10205.__streams_pool_size=0
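As a quick sanity check on these hidden values, the following Python sketch converts them to megabytes and totals them. It is only an illustration: the text is pasted from the excerpt above, and the helper name parse_hidden_sizes is made up for this example and is not part of any Oracle tool.

```python
# Hidden "__" entries copied from the SPFILE excerpt above (values are in bytes).
SPFILE_LINES = """\
ora10205.__db_cache_size=394264576
ora10205.__java_pool_size=4194304
ora10205.__large_pool_size=4194304
ora10205.__shared_pool_size=104857600
ora10205.__streams_pool_size=0
"""

MB = 1024 * 1024


def parse_hidden_sizes(text):
    """Return {parameter_name: size_in_mb} for each '<sid>.__name=value' line.

    Hypothetical helper for illustration only.
    """
    sizes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        name = key.split(".", 1)[-1]  # drop the instance (SID) prefix
        if name.startswith("__"):
            sizes[name] = int(value) // MB
    return sizes


sizes_mb = parse_hidden_sizes(SPFILE_LINES)
for name, mb in sizes_mb.items():
    print(f"{name:<22}{mb:>6} MB")
print(f"{'total':<22}{sum(sizes_mb.values()):>6} MB")
```

For this excerpt the auto-tuned components total 484 MB: 376 MB of buffer cache, 100 MB of shared pool, and 4 MB each for the large and Java pools.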
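However, setting SGA_TARGET does not mean that all is well and that memory can move freely among the components. Experience shows that when memory is taken back and forth intensely between components (especially the shared pool and the buffer cache), the instance can hang briefly and affect the business. The V$SGA_RESIZE_OPS view shows how memory is being resized. The following comes from one of the author's databases, where thrashing between the buffer cache and the shared pool was severe: within the span of a minute, the buffer cache and the shared pool were being resized repeatedly, as shown below: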
SQL> select component,oper_type,target_size,final_size,start_time,end_time
  2  from v$sga_resize_ops order by start_time desc;

COMPONENT            OPER_TYPE   TARGET_SIZE   FINAL_SIZE START_TIME           END_TIME
-------------------- ---------- ------------ ------------ -------------------- -------------------
DEFAULT buffer cache GROW        54760833024  54760833024 2012-11-01 15:25:23  2012-11-01 15:25:25
shared pool          SHRINK       8455716864   8455716864 2012-11-01 15:25:23  2012-11-01 15:25:25
DEFAULT buffer cache GROW        54626615296  54626615296 2012-11-01 15:24:53  2012-11-01 15:24:54
shared pool          SHRINK       8589934592   8589934592 2012-11-01 15:24:53  2012-11-01 15:24:54
shared pool          GROW         8724152320   8724152320 2012-11-01 15:19:17  2012-11-01 15:19:17
DEFAULT buffer cache SHRINK      54492397568  54492397568 2012-11-01 15:19:17  2012-11-01 15:19:17
shared pool          SHRINK       8589934592   8589934592 2012-11-01 15:19:16  2012-11-01 15:19:17
DEFAULT buffer cache GROW        54626615296  54626615296 2012-11-01 15:19:16  2012-11-01 15:19:17
shared pool          SHRINK       8724152320   8724152320 2012-11-01 15:18:48  2012-11-01 15:18:49
DEFAULT buffer cache GROW        54492397568  54492397568 2012-11-01 15:18:48  2012-11-01 15:18:49
shared pool          GROW         8858370048   8858370048 2012-11-01 15:17:06  2012-11-01 15:17:07
DEFAULT buffer cache SHRINK      54358179840  54358179840 2012-11-01 15:17:06  2012-11-01 15:17:07
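One rough way to quantify the thrashing in this output is to count how often each component reverses direction (a GROW followed by a SHRINK, or the reverse). The Python sketch below does this on rows copied from the listing above. The function name count_reversals, and the idea of treating reversals as a thrashing indicator, are illustrative assumptions and not an Oracle-defined metric.

```python
from datetime import datetime

# (component, oper_type, start_time) copied from the V$SGA_RESIZE_OPS listing above.
ROWS = [
    ("DEFAULT buffer cache", "GROW",   "2012-11-01 15:25:23"),
    ("shared pool",          "SHRINK", "2012-11-01 15:25:23"),
    ("DEFAULT buffer cache", "GROW",   "2012-11-01 15:24:53"),
    ("shared pool",          "SHRINK", "2012-11-01 15:24:53"),
    ("shared pool",          "GROW",   "2012-11-01 15:19:17"),
    ("DEFAULT buffer cache", "SHRINK", "2012-11-01 15:19:17"),
    ("shared pool",          "SHRINK", "2012-11-01 15:19:16"),
    ("DEFAULT buffer cache", "GROW",   "2012-11-01 15:19:16"),
    ("shared pool",          "SHRINK", "2012-11-01 15:18:48"),
    ("DEFAULT buffer cache", "GROW",   "2012-11-01 15:18:48"),
    ("shared pool",          "GROW",   "2012-11-01 15:17:06"),
    ("DEFAULT buffer cache", "SHRINK", "2012-11-01 15:17:06"),
]


def count_reversals(rows):
    """Count GROW<->SHRINK direction changes per component, oldest first.

    Hypothetical helper for illustration only.
    """
    history = {}
    for component, oper_type, start in rows:
        ts = datetime.strptime(start, "%Y-%m-%d %H:%M:%S")
        history.setdefault(component, []).append((ts, oper_type))

    reversals = {}
    for component, ops in history.items():
        ops.sort(key=lambda op: op[0])  # chronological order
        reversals[component] = sum(
            1 for prev, cur in zip(ops, ops[1:]) if prev[1] != cur[1]
        )
    return reversals


reversals = count_reversals(ROWS)
for component, n in reversals.items():
    print(f"{component}: {n} direction changes")
```

On this sample both components reverse direction three times in a little over eight minutes. On a live system the same rows could be fetched from V$SGA_RESIZE_OPS with a database driver instead of being pasted in.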
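To reduce this kind of thrashing, it is best to set SHARED_POOL_SIZE and DB_CACHE_SIZE together with SGA_TARGET. Under ASMM these values act as lower bounds, so the shared pool and the buffer cache stay at or above them, which lowers the frequency of resizing. However, do not set SHARED_POOL_SIZE and DB_CACHE_SIZE too high: their sum should not come close to SGA_TARGET. Otherwise the instance cannot make good use of the Oracle 10g ASMM feature, and failures related to memory allocation among the components become more likely.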
Tip: Under ASMM, you can set _memory_management_tracing to 7 to observe how memory is allocated among the components. Do not set it casually on a production system!
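Besides automatic resizing by the system, memory components can also be resized manually by command. In general, manually resizing memory components during peak business hours is not recommended (especially on older database versions), because the author has seen manual resizing hang the instance on several occasions. A DBA's work is high-risk: one careless step onto a "landmine" can be disastrous. The following is a case the author handled recently.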
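A customer's database, version 10.2.0.5, suddenly could not be connected to (even logging in directly on the server failed). During the incident, the alert log showed a large number of ORA-04031 errors, as shown below: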
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","X$KSUXSINST","KGLS heap","kglhin: temp")
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred
Warning: out of shared memory loading library cache object [handle=70000020afa8850] SYS.X$KSUXSINST
The background trace file shows that a subheap in the shared pool had 0 bytes of free memory, as shown below:
HEAP DUMP heap name="sql area" desc=7000001e8a122c8
 extent sz=0xfe8 alt=32767 het=312 rec=0 flg=2 opc=2
 parent=700000010000058 owner=7000001e8a121d0 nex=0 xsz=0x1000000
Subheap has 0 bytes of memory allocated
Because the subheap had no free memory, some sessions waited on the SGA: allocation forcing component growth event, as shown below:
Dumping Session Wait History
 for 'SGA: allocation forcing component growth' count=1 wait_time=0.000115 sec
     =0, =0, =0
 for 'rdbms ipc message' count=1 wait_time=0.937516 sec
     timeout=60, =0, =0
 for 'SGA: allocation forcing component growth' count=1 wait_time=0.000061 sec
     =0, =0, =0
 for 'SGA: allocation forcing component growth' count=1 wait_time=0.000065 sec
     =0, =0, =0
 for 'SGA: allocation forcing component growth' count=1 wait_time=0.000061 sec
     =0, =0, =0
 for 'SGA: allocation forcing component growth' count=1 wait_time=0.000066 sec
     =0, =0, =0
 for 'SGA: allocation forcing component growth' count=1 wait_time=0.000062 sec
     =0, =0, =0
 for 'SGA: allocation forcing component growth' count=1 wait_time=0.000075 sec
     =0, =0, =0
 for 'rdbms ipc message' count=1 wait_time=1.953144 sec
     timeout=c8, =0, =0
 for 'SGA: allocation forcing component growth' count=1 wait_time=0.000071 sec
     =0, =0, =0
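Memory components taking memory from one another can indeed produce the SGA: allocation forcing component growth wait event. Further checking showed that SGA_TARGET was 7904MB, while DB_CACHE_SIZE was set to 6448MB and SHARED_POOL_SIZE to 1280MB.
The sum of DB_CACHE_SIZE and SHARED_POOL_SIZE (7728MB) was close to SGA_TARGET. When the shared pool came under memory pressure, it could not obtain more free memory from the buffer cache, which severely restricted the dynamic growth of the shared pool and the other memory components.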
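The arithmetic behind this diagnosis can be sketched in a few lines of Python. The figures are the parameter values reported above. The function name asmm_headroom_mb is hypothetical, and the calculation assumes that the user-set DB_CACHE_SIZE and SHARED_POOL_SIZE act as lower bounds under ASMM.

```python
def asmm_headroom_mb(sga_target_mb, floors_mb):
    """Memory (in MB) left for ASMM to move around once the floors are honored.

    Hypothetical helper for illustration only; floors_mb maps each
    user-set parameter to its value in MB.
    """
    return sga_target_mb - sum(floors_mb.values())


SGA_TARGET_MB = 7904
floors = {"db_cache_size": 6448, "shared_pool_size": 1280}

headroom = asmm_headroom_mb(SGA_TARGET_MB, floors)
print(f"headroom: {headroom} MB ({headroom / SGA_TARGET_MB:.1%} of SGA_TARGET)")
```

Only 176MB, about 2.2% of SGA_TARGET, is left for ASMM to redistribute, and that remainder also has to cover the other pools, so the shared pool has almost nowhere to grow.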
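A later investigation found that, a few days earlier, the DBA managing this database had raised DB_CACHE_SIZE from 4GB to about 6GB, which left the shared pool unable to grow automatically when it ran short of memory. The alert log shows: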
Sat Nov 01 20:19:30 CST 2012
ALTER SYSTEM SET db_cache_size='6448M' SCOPE=BOTH;