Dameng Distributed Cluster DPC: Optimization Case 01_yxy
- 1 SQL to optimize
- 2 Optimization approach
- 3 Execution plan analysis
- 4 Summary
1 SQL to optimize
The cluster has 3 BP nodes, 1 SP node, and 1 MP node.
FLD table: 2.7 billion rows
S table: 100 million rows
SELECT COUNT(1) FROM (
  SELECT *... FROM FLD INNER JOIN S
    ON FLD.ID = S.ID AND FLD.PNO = S.PNO AND FLD.SID = S.SID
  WHERE FLD.GCODE NOT IN ('2','4') AND FLD.VFLAG = '1'
    AND S.VFLAG = '1' AND S.RSFLAG = '0' AND S.GCODE NOT IN ('2','4')
    AND S.STIME >= '2024-01-01' AND S.STIME < '2024-02-01'
    -- AND FLD.FTIME > '2023-11-01'  (condition added later)
  GROUP BY ......
) A
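2 Optimization approach
① Use partition pruning in the distributed cluster
Partition pruning: how it works
Both tables use range partitioning at the first level.
The two time columns in the predicates (FTIME and STIME) are the partition columns, so the optimizer can prune partitions and skip scans of ranges the query does not need.
② Use PWJ (partition-wise join)
Partition-wise join plus a changed distribution method: how it works
Both tables use HASH partitioning at the second level.
Both use PNO as the partition column, so PWJ applies. Each node holds 4 sub-tables, 12 in total.
③ Change the table distribution method and raise parallelism to improve performance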
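The pruning idea can be sketched in a few lines of Python. This is only a model of the principle, not of DM's optimizer: the monthly partition granularity, the partition names, and the helper functions `month_partitions` and `prune` are assumptions made for illustration, since the article does not give the actual partition boundaries. Only the STIME predicate range comes from the query.

```python
from datetime import date

def month_partitions(start_year, start_month, count):
    """Build `count` consecutive monthly range partitions.

    Each partition is (name, lower, upper) with an exclusive upper bound.
    The monthly layout is hypothetical.
    """
    parts = []
    y, m = start_year, start_month
    for _ in range(count):
        ny, nm = (y + 1, 1) if m == 12 else (y, m + 1)
        parts.append((f"P{y}{m:02d}", date(y, m, 1), date(ny, nm, 1)))
        y, m = ny, nm
    return parts

def prune(parts, lo, hi):
    """Keep partitions whose [lower, upper) overlaps the predicate range [lo, hi)."""
    return [name for name, p_lo, p_hi in parts if p_lo < hi and lo < p_hi]

# Hypothetical layout for S: Jan 2023 through Dec 2024.
s_parts = month_partitions(2023, 1, 24)

# Predicate from the query: S.STIME >= '2024-01-01' AND S.STIME < '2024-02-01'
kept = prune(s_parts, date(2024, 1, 1), date(2024, 2, 1))
print(kept)  # ['P202401']
```

Under this assumed layout, 23 of the 24 partitions are never touched, which is the effect the range partitioning on STIME and FTIME is meant to achieve.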
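Why hash partitioning both tables on PNO enables PWJ can be shown with a small Python sketch. It assumes `zlib.crc32` as a stand-in hash (DM's real hash function is not given in the article) and uses made-up rows; `bucket`, `hash_partition`, and `partition_wise_join` are hypothetical helpers. The point is only that rows with equal PNO land in the same sub-table index on both sides, so sub-table i of one table needs to be joined only with sub-table i of the other, with no redistribution.

```python
import zlib

SUBTABLES = 4  # the article reports 4 sub-tables per node

def bucket(pno):
    """Map a PNO value to a sub-table index (crc32 is a stand-in, not DM's hash)."""
    return zlib.crc32(str(pno).encode()) % SUBTABLES

def hash_partition(rows):
    """Split rows (dicts with a 'PNO' key) into SUBTABLES buckets."""
    parts = [[] for _ in range(SUBTABLES)]
    for row in rows:
        parts[bucket(row["PNO"])].append(row)
    return parts

def partition_wise_join(left_parts, right_parts):
    """Hash-join bucket i of the left side only with bucket i of the right side."""
    out = []
    for lpart, rpart in zip(left_parts, right_parts):
        index = {}
        for rrow in rpart:
            index.setdefault((rrow["ID"], rrow["PNO"], rrow["SID"]), []).append(rrow)
        for lrow in lpart:
            for rrow in index.get((lrow["ID"], lrow["PNO"], lrow["SID"]), []):
                out.append((lrow, rrow))
    return out

# Made-up sample data: every S row has exactly one matching FLD row.
fld = [{"ID": i, "PNO": i % 7, "SID": 1} for i in range(20)]
s = [{"ID": i, "PNO": i % 7, "SID": 1} for i in range(0, 20, 2)]

joined = partition_wise_join(hash_partition(fld), hash_partition(s))
print(len(joined))  # 10
```

Each bucket pair is an independent unit of work, which is also why the join parallelism in this scheme is tied to the number of sub-tables.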
/*+ PARALLEL(16) dpc(1 L_NO_R_NO)*/
/*+ PARALLEL(32) dpc(1 l_bro)*/
④ Covering indexes
CREATE INDEX YXY2025 on FLD(...,FTIME,...) UNUSABLE;
ALTER INDEX YXY2025 rebuild SHARE ASYNCHRONOUS 32;
CREATE INDEX YXY2025SFULL on S(STIME,...) UNUSABLE;
ALTER INDEX YXY2025SFULL rebuild SHARE ASYNCHRONOUS 64;
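3 Execution plan analysis
① Adjusted table structure (second-level partitioning), covering indexes, parallelism of 32, and PWJ: elapsed time 324 s
Hint added: /*+ dpc(1 L_NO_R_NO)*/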
1 #NSET2: [7019, 1, 794]
2 #PRJT2: [7019, 1, 794]; exp_num(1), is_atom(FALSE)
3 #AAGR2: [7019, 1, 794]; grp_num(0), sfun_num(1), distinct_flag[0]; slave_empty(0)
4 #ERECV: [7019, 1, 794]; stask_no(-1), l_stask_no(1), n_key(0), in_turn(0), trig(0)
5 #ESEND: [7019, 1, 794]; stask_no(1), type(DIRECT), sites(4:32), sql_invoke(0), pwj_opt(0), table(-); INFO_BITS(0x8)
6 #AAGR2: [7019, 1, 794]; grp_num(0), sfun_num(1), distinct_flag[0]; slave_empty(0)
7 #PRJT2: [7019, 5444027, 794]; exp_num(0), is_atom(FALSE)
8 #HAGR2: [7019, 5444027, 794]; grp_num(7), sfun_num(0); slave_empty(0) keys(DMTEMPVIEW_889276025.TMPCOL0, DMTEMPVIEW_889276025.TMPCOL1, DMTEMPVIEW_889276025.TMPCOL2, DMTEMPVIEW_889276025.TMPCOL3, DMTEMPVIEW_889276025.TMPCOL4, DMTEMPVIEW_889276025.TMPCOL5, DMTEMPVIEW_889276025.TMPCOL6)
9 #ERECV: [5638, 5444027, 794]; stask_no(1), l_stask_no(0), n_key(0), in_turn(0), trig(0)
10 #ESEND: [5638, 5444027, 794]; stask_no(0), type(N_DEST), sites(2:4,3:4,1:4), sql_invoke(0), pwj_opt(0), table(-) ,keys(DMTEMPVIEW_889276025.TMPCOL0) ; INFO_BITS(0xc)
11 #GI: [5638, 5444027, 794]; policy(LVL2_UNIT), gi_unit[0..1], scan_type[0](GE_L,FULL)
12 #PRJT2: [5638, 5444027, 794]; exp_num(7), is_atom(FALSE)
13 #HASH2 INNER JOIN: [5638, 5444027, 794]; KEY_NUM(3); KEY(S.ID=FLD.ID AND S.PNO=FLD.PNO AND S.SID=FLD.SID) KEY_NULL_EQU(0, 0, 0)
14 #GI: [605, 1328, 541]; policy(LVL2_UNIT), gi_unit[0..0], scan_type[0](FULL)
15 #SLCT2: [605, 1328, 541]; (S.VFLAG = '1' AND S.RSFLAG = '0' AND NOT(S.GCODE IN LIST))
16 #SSEK2: [605, 2231793, 541]; scan_type(ASC), YXY2025SFULL(S as S), scan_range[(exp_cast('2024-01-01'),min,min,min,min,min,min,min,min,min,min,min),(exp_cast('2024-02-01'),min,max,max,max,max,max,max,max,max,max,max)), is_global(0)
17 #GI: [3464, 16497054, 253]; policy(LVL2_UNIT), gi_unit[1..1], scan_type[0](FULL)
18 #SLCT2: [3464, 16497054, 253]; NOT(FLD.GCODE IN LIST)
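19 #SSEK2: [3464, 17365320, 253]; scan_type(ASC), YXY2025(FLD as FLD), scan_range[('1',exp_cast('2023-11-01'),max,min,min,min,min,min),('1',max,max,max,max,max,max,max)), is_global(0)
Analysis:
① The plan uses PWJ: there is no distribution operator between the join (13 #HASH2 INNER JOIN) and either table.
② sites(2:4,3:4,1:4): the requested parallelism did not take effect. PWJ threads are bound to the number of sub-tables, and each node has only 4, so the parallelism is 4 per node.
③ 11 #GI: ... policy(LVL2_UNIT) ... scan_type[0](GE_L,FULL) is the controlling GI; it keeps the scan out of range partitions that are not relevant.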
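The parallelism actually used can be read straight from the `sites(...)` field of an `#ESEND` line. The Python sketch below pulls it out of two lines copied from the plan above (truncated after `pwj_opt(0)`). Reading each entry as `site:threads` follows the article's interpretation; `parse_sites` and `pwj_threads` are hypothetical helpers, and `pwj_threads` merely encodes the observation that PWJ parallelism is capped by the sub-table count.

```python
import re

def parse_sites(plan_line):
    """Return {site_id: thread_count} from the sites(...) field of a plan line."""
    match = re.search(r"sites\(([^)]*)\)", plan_line)
    if match is None:
        return {}
    pairs = (item.split(":") for item in match.group(1).split(","))
    return {int(site): int(threads) for site, threads in pairs}

def pwj_threads(requested, subtables_per_node):
    """Model of the article's observation: PWJ threads are capped by sub-table count."""
    return min(requested, subtables_per_node)

top_line = "5 #ESEND: [7019, 1, 794]; stask_no(1), type(DIRECT), sites(4:32), sql_invoke(0), pwj_opt(0)"
join_line = "10 #ESEND: [5638, 5444027, 794]; stask_no(0), type(N_DEST), sites(2:4,3:4,1:4), sql_invoke(0), pwj_opt(0)"

print(parse_sites(top_line))   # {4: 32}
print(parse_sites(join_line))  # {2: 4, 3: 4, 1: 4}
print(pwj_threads(32, 4))      # 4
```

The upper subtask reports 32 threads, while the subtask containing the join reports 4 on each of the three sites, matching the 4 sub-tables per node.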
② Broadcast, parallelism set to 16: elapsed time 160 s
Hint added: /*+ PARALLEL(16) dpc(1 l_bro)*/
1 #NSET2: [7019, 1, 794]
2 #PRJT2: [7019, 1, 794]; exp_num(1), is_atom(FALSE)
3 #AAGR2: [7019, 1, 794]; grp_num(0), sfun_num(1), distinct_flag[0]; slave_empty(0)
4 #ERECV: [7019, 1, 794]; stask_no(-1), l_stask_no(2), n_key(0), in_turn(0), trig(0)
5 #ESEND: [7019, 1, 794]; stask_no(2), type(DIRECT), sites(4:16), sql_invoke(0), pwj_opt(0), table(-); INFO_BITS(0x8)
6 #AAGR2: [7019, 1, 794]; grp_num(0), sfun_num(1), distinct_flag[0]; slave_empty(0)
7 #PRJT2: [7019, 5444027, 794]; exp_num(0), is_atom(FALSE)
8 #HAGR2: [7019, 5444027, 794]; grp_num(7), sfun_num(0); slave_empty(0) keys(DMTEMPVIEW_889273671.TMPCOL0, DMTEMPVIEW_889273671.TMPCOL1, DMTEMPVIEW_889273671.TMPCOL2, DMTEMPVIEW_889273671.TMPCOL3, DMTEMPVIEW_889273671.TMPCOL4, DMTEMPVIEW_889273671.TMPCOL5, DMTEMPVIEW_889273671.TMPCOL6)
9 #ERECV: [5638, 5444027, 794]; stask_no(2), l_stask_no(1), n_key(0), in_turn(0), trig(0)
10 #ESEND: [5638, 5444027, 794]; stask_no(1), type(N_DEST), sites(2:16,3:16,1:16), sql_invoke(0), pwj_opt(0), table(-) ,keys(DMTEMPVIEW_889273671.TMPCOL0) ; INFO_BITS(0x8)
11 #PRJT2: [5638, 5444027, 794]; exp_num(7), is_atom(FALSE)
12 #HASH2 INNER JOIN: [5638, 5444027, 794]; KEY_NUM(3); KEY(S.ID=FLD.ID AND S.PNO=FLD.PNO AND S.SID=FLD.SID) KEY_NULL_EQU(0, 0, 0); INFO_BITS(0x1)
13 #ERECV: [605, 1328, 541]; stask_no(1), l_stask_no(0), n_key(0), in_turn(0), trig(0)
14 #ESEND: [605, 1328, 541]; stask_no(0), type(BROADCAST), sites(2:4,3:4,1:4), sql_invoke(0), pwj_opt(0), table(-); INFO_BITS(0xd)
15 #GI: [605, 1328, 541]; policy(PART_UNIT), gi_unit[0..0], scan_type[0](GE_L,FULL)
16 #SLCT2: [605, 1328, 541]; (S.VFLAG = '1' AND S.RSFLAG = '0' AND NOT(S.GCODE IN LIST))
17 #SSEK2: [605, 2231793, 541]; scan_type(ASC), YXY2025SFULL(S as S), scan_range[(exp_cast('2024-01-01'),min,min,min,min,min,min,min,min,min,min,min),(exp_cast('2024-02-01'),min,max,max,max,max,max,max,max,max,max,max)), is_global(0)
18 #GI: [3464, 16497054, 253]; policy(PART_UNIT), gi_unit[0..0], scan_type[0](G,FULL)
19 #SLCT2: [3464, 16497054, 253]; NOT(FLD.GCODE IN LIST)
20 #SSEK2: [3464, 17365320, 253]; scan_type(ASC), YXY2025(FLD as FLD), scan_range[('1',exp_cast('2023-11-01'),max,min,min,min,min,min),('1',max,max,max,max,max,max,max)), is_global(0)
Analysis:
① The plan uses broadcast: the left side of the join is 14 #ESEND: ... type(BROADCAST).
② Parallelism is now 16 [10 #ESEND: ... sites(2:16,3:16,1:16)].
Elapsed time drops to 160 s.
③ Broadcast, parallelism set to 32: elapsed time 110 s
Hint added: /*+ PARALLEL(32) dpc(1 l_bro)*/
1 #NSET2: [7019, 1, 794]
2 #PRJT2: [7019, 1, 794]; exp_num(1), is_atom(FALSE)
3 #AAGR2: [7019, 1, 794]; grp_num(0), sfun_num(1), distinct_flag[0]; slave_empty(0)
4 #ERECV: [7019, 1, 794]; stask_no(-1), l_stask_no(2), n_key(0), in_turn(0), trig(0)
5 #ESEND: [7019, 1, 794]; stask_no(2), type(DIRECT), sites(4:32), sql_invoke(0), pwj_opt(0), table(-); INFO_BITS(0x8)
6 #AAGR2: [7019, 1, 794]; grp_num(0), sfun_num(1), distinct_flag[0]; slave_empty(0)
7 #PRJT2: [7019, 5444027, 794]; exp_num(0), is_atom(FALSE)
8 #HAGR2: [7019, 5444027, 794]; grp_num(7), sfun_num(0); slave_empty(0) keys(DMTEMPVIEW_889274813.TMPCOL0, DMTEMPVIEW_889274813.TMPCOL1, DMTEMPVIEW_889274813.TMPCOL2, DMTEMPVIEW_889274813.TMPCOL3, DMTEMPVIEW_889274813.TMPCOL4, DMTEMPVIEW_889274813.TMPCOL5, DMTEMPVIEW_889274813.TMPCOL6)
9 #ERECV: [5638, 5444027, 794]; stask_no(2), l_stask_no(1), n_key(0), in_turn(0), trig(0)
10 #ESEND: [5638, 5444027, 794]; stask_no(1), type(N_DEST), sites(2:32,3:32,1:32), sql_invoke(0), pwj_opt(0), table(-) ,keys(DMTEMPVIEW_889274813.TMPCOL0) ; INFO_BITS(0x8)
11 #PRJT2: [5638, 5444027, 794]; exp_num(7), is_atom(FALSE)
12 #HASH2 INNER JOIN: [5638, 5444027, 794]; KEY_NUM(3); KEY(S.ID=FLD.ID AND S.PNO=FLD.PNO AND S.SID=FLD.SID) KEY_NULL_EQU(0, 0, 0); INFO_BITS(0x1)
13 #ERECV: [605, 1328, 541]; stask_no(1), l_stask_no(0), n_key(0), in_turn(0), trig(0)
14 #ESEND: [605, 1328, 541]; stask_no(0), type(BROADCAST), sites(2:4,3:4,1:4), sql_invoke(0), pwj_opt(0), table(-); INFO_BITS(0xd)
15 #GI: [605, 1328, 541]; policy(PART_UNIT), gi_unit[0..0], scan_type[0](GE_L,FULL)
16 #SLCT2: [605, 1328, 541]; (S.VFLAG = '1' AND S.RSFLAG = '0' AND NOT(S.GCODE IN LIST))
17 #SSEK2: [605, 2231793, 541]; scan_type(ASC), YXY2025SFULL(S as S), scan_range[(exp_cast('2024-01-01'),min,min,min,min,min,min,min,min,min,min,min),(exp_cast('2024-02-01'),min,max,max,max,max,max,max,max,max,max,max)), is_global(0)
18 #GI: [3464, 16497054, 253]; policy(PART_UNIT), gi_unit[0..0], scan_type[0](G,FULL)
19 #SLCT2: [3464, 16497054, 253]; NOT(FLD.GCODE IN LIST)
20 #SSEK2: [3464, 17365320, 253]; scan_type(ASC), YXY2025(FLD as FLD), scan_range[('1',exp_cast('2023-11-01'),max,min,min,min,min,min),('1',max,max,max,max,max,max,max)), is_global(0)
Analysis:
① The plan uses broadcast: the left side of the join is 14 #ESEND: ... type(BROADCAST).
② Parallelism is now 32 [10 #ESEND: ... sites(2:32,3:32,1:32)].
Elapsed time drops to 110 s.
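The three timings can be compared with a little arithmetic. The Python sketch below uses only the elapsed times and per-node thread counts reported above; `speedup` is a hypothetical helper. Note that the first step changes the distribution method as well as the thread count, so it is not a pure thread-scaling comparison.

```python
def speedup(before_s, after_s):
    """Ratio of elapsed times: how many times faster the second run is."""
    return before_s / after_s

# (label, threads per node in the join subtask, elapsed seconds)
runs = [
    ("PWJ, 4 threads per node", 4, 324),
    ("broadcast, PARALLEL(16)", 16, 160),
    ("broadcast, PARALLEL(32)", 32, 110),
]

for (prev_label, prev_t, prev_s), (label, t, s) in zip(runs, runs[1:]):
    gain = speedup(prev_s, s)
    thread_ratio = t / prev_t
    print(f"{prev_label} -> {label}: "
          f"{thread_ratio:.0f}x threads, {gain:.1f}x faster, "
          f"efficiency {gain / thread_ratio:.0%}")
```

Going from 4 to 16 threads per node makes the query roughly twice as fast, while doubling again from 16 to 32 gives only about 1.45x. Each extra thread buys less, which is consistent with the diminishing returns noted in the summary.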
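4 Summary
① Table structure matters most in distributed SQL tuning: it determines whether a statement can use partition pruning and partition-wise join.
② A broadcast plan is not necessarily slower than a partition-wise join. What matters is making good use of server resources; here, performance improved as parallelism rose.
③ Raising parallelism further increases overhead such as thread coordination, so beyond some point results get worse and performance drops.
④ Increasing the number of sub-tables per node raises the PWJ join parallelism, but with more sub-tables thread resources may become the bottleneck, so weigh this against the workload as a whole.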