SQL Optimization: Testing Execution Plans for Subquery Unnesting
1. What is subquery unnesting
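Subquery unnesting is one of the techniques the optimizer uses to process SQL that contains subqueries. Instead of treating the subquery as an independent unit, the optimizer rewrites it as an equivalent join between the tables in the subquery and the outer query. The rewrite takes one of two forms:
a. The subquery is dissolved: the tables or views inside it are joined directly to the outer tables.
b. The subquery is kept intact and is joined to the outer tables or views as an inline view.
For form b, from 10g onward the optimizer computes the cost of the unnested statement and unnests only if the cost goes down.
If a subquery is not unnested, it is usually executed as the last step of the plan, under a FILTER operation. Unnesting usually improves performance, especially when the subquery involves two or more tables, because it opens up more join methods, such as a hash join.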
A subquery can be unnested when it is introduced by one of the following conditions and certain other requirements are met:
single-row(=,<,>,<=,>=,<>)
exists
not exists
in
not in
any
all
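One way to check whether a given subquery was unnested is to look at its plan: a FILTER operation with the subquery as a child suggests it was not unnested, while a join operation such as HASH JOIN SEMI suggests it was. A minimal sketch, assuming the t1 and t2 tables created in the next section already exist:

```sql
-- Generate the plan without running the statement.
explain plan for
select *
  from t1
 where object_id in (select object_id from t2 where t2.object_id > 40000);

-- Display the plan that was just stored in PLAN_TABLE.
select * from table(dbms_xplan.display);
```

EXPLAIN PLAN shows only the estimated plan; the tests below use SQL*Plus autotrace instead, which also reports run-time statistics.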
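2. The first form of subquery unnesting
The subquery is dissolved: the tables or views inside it are joined directly to the outer tables.
Here is an example of subquery unnesting (the three queries are equivalent):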
create table t1 as select * from dba_objects;
create table t2 as select * from dba_objects;
select * from t1 where object_id in (select object_id from t2 where t2.object_id>40000);
select * from t1 d where exists (select 1 from t2 e where d.object_id=e.object_id and e.object_id>40000);
select * from t1 where object_id =any (select object_id from t2 where t2.object_id>40000);
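To see what unnesting means at the SQL level, the IN form can be rewritten by hand as a join. The sketch below only illustrates the idea and is not the optimizer's actual internal rewrite; the inline view alias v is a name chosen here. The DISTINCT keeps the result equivalent when t2.object_id contains duplicates, since a plain join would otherwise return a t1 row once per matching t2 row.

```sql
-- Hand-written join form of the IN subquery (illustrative only).
select t1.*
  from t1,
       (select distinct object_id
          from t2
         where t2.object_id > 40000) v
 where t1.object_id = v.object_id;
```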
First, add the no_unnest hint so that the subquery is not unnested:
set autotrace traceonly
set timing on
SQL> select * from t1 where object_id in (select /*+ no_unnest */ object_id from t2 where t2.object_id>40000);

46781 rows selected.

Elapsed: 00:05:19.52

Execution Plan
----------------------------------------------------------
Plan hash value: 2087281748

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    32 |  6624 |  8213   (1)| 00:01:39 |
|*  1 |  FILTER             |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL | T1   | 83819 |    16M|   345   (1)| 00:00:05 |
|*  3 |   FILTER            |      |       |       |            |          |
|*  4 |    TABLE ACCESS FULL| T2   |   478 |  6214 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "T2" "T2" WHERE
              40000<:B1 AND "OBJECT_ID"=:B2 AND "T2"."OBJECT_ID">40000))
   3 - filter(40000<:B1)
   4 - filter("OBJECT_ID"=:B1 AND "T2"."OBJECT_ID">40000)

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
         14  recursive calls
          0  db block gets
   42262849  consistent gets
          0  physical reads
          0  redo size
    2466961  bytes sent via SQL*Net to client
      34818  bytes received via SQL*Net from client
       3120  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      46781  rows processed
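As the plan shows, the statement does use a FILTER operation, and the subquery is executed as the last step, acting only as a filter on the 83819 rows estimated for t1.
This confirms the earlier point: a subquery that cannot be unnested is usually executed as the last step of the plan, under a FILTER operation.
The full scan of t1 has an estimated cardinality of 83819. The object_id column is effectively unique in t1 (the values come from dba_objects, although no primary key is actually declared on t1), so every row supplies a different value for the bind variable. The optimizer therefore has to evaluate the subquery, driven by the predicate "OBJECT_ID"=:B1, about 83819 times, once per t1 row. That is why the statement takes 5 minutes 19 seconds and performs 42262849 logical reads.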
Now remove no_unnest and run the statement again (a leading hint is added to make the unnesting easier to see):
SQL> select /*+ leading(t1) */ * from t1 where object_id in (select object_id from t2 where t2.object_id>40000);

46781 rows selected.

Elapsed: 00:00:02.52

Execution Plan
----------------------------------------------------------
Plan hash value: 1713220790

-----------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 32758 |  7037K|       |  1086   (1)| 00:00:14 |
|*  1 |  HASH JOIN SEMI    |      | 32758 |  7037K|  7008K|  1086   (1)| 00:00:14 |
|*  2 |   TABLE ACCESS FULL| T1   | 32758 |  6621K|       |   345   (1)| 00:00:05 |
|*  3 |   TABLE ACCESS FULL| T2   | 47814 |   607K|       |   345   (1)| 00:00:05 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("OBJECT_ID"="OBJECT_ID")
   2 - filter("OBJECT_ID">40000)
   3 - filter("T2"."OBJECT_ID">40000)

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       5557  consistent gets
          0  physical reads
          0  redo size
    2466961  bytes sent via SQL*Net to client
      34818  bytes received via SQL*Net from client
       3120  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      46781  rows processed
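As the plan shows, the optimizer pulls t2 out of the subquery and hash joins it to t1. It also applies the predicate "T2"."OBJECT_ID">40000 to t1 as a filter. The benefit is obvious: each table is accessed only once, instead of t2 being probed 83819 times as before.
The gain from unnesting shows up in elapsed time (2.52 seconds versus 5 minutes 19 seconds), logical reads (5557 versus 42262849), and cost (1086 versus 8213).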
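After unnesting, the subquery above becomes a hash semi join (HASH JOIN SEMI). If the optimizer knows that object_id in the driven table (T2) contains no duplicates, for example because there is a unique index on it, the semi join is converted into an ordinary inner join (HASH JOIN).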
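A hedged sketch of how to observe that conversion: create a unique index on t2.object_id and look at the plan again. The index name idx_t2_object_id is made up for this example, and the CREATE statement fails with ORA-01452 if t2.object_id happens to contain duplicate values.

```sql
-- Hypothetical index name; requires object_id to be unique in t2.
create unique index idx_t2_object_id on t2 (object_id);

explain plan for
select /*+ leading(t1) */ *
  from t1
 where object_id in (select object_id from t2 where t2.object_id > 40000);

select * from table(dbms_xplan.display);

-- Drop the index afterwards so the later tests run against the original setup.
drop index idx_t2_object_id;
```

With the index in place the optimizer may also pick a different join method or access path, so the thing to look for is a regular join operation in place of the SEMI variant.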
The counterpart of the semi join is the anti join:
select /*+ leading(t1) */ * from t1 where object_id not in (select object_id from t2 where t2.object_id>40000);
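NOT IN is always equivalent to <> ALL. It is equivalent to NOT EXISTS only when object_id contains no NULLs, in t2 and also in t1. Under that condition the statement above is equivalent to the following two: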
select * from t1 where object_id <>all (select object_id from t2 where t2.object_id>40000);
select * from t1 where not exists (select 1 from t2 where t1.object_id=t2.object_id and t2.object_id>40000);
SQL> select /*+ leading(t1) */ * from t1 where object_id not in (select object_id from t2 where t2.object_id>40000);

39660 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1270581391

-----------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 37966 |  3818K|       |  1180   (1)| 00:00:15 |
|*  1 |  HASH JOIN ANTI SNA|      | 37966 |  3818K|  9288K|  1180   (1)| 00:00:15 |
|   2 |   TABLE ACCESS FULL| T1   | 86443 |  8272K|       |   345   (1)| 00:00:05 |
|*  3 |   TABLE ACCESS FULL| T2   | 48477 |   236K|       |   345   (1)| 00:00:05 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("OBJECT_ID"="OBJECT_ID")
   3 - filter("T2"."OBJECT_ID">40000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2478  consistent gets
          0  physical reads
          0  redo size
    3670337  bytes sent via SQL*Net to client
      29593  bytes received via SQL*Net from client
       2645  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      39660  rows processed
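As the plan shows, Oracle has unnested the subquery (select object_id from t2 where t2.object_id>40000) and hash anti joins it to the outer table (HASH JOIN ANTI SNA, a null-aware variant of the anti join).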
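The reason NULLs matter for the equivalence between NOT IN and NOT EXISTS is easy to demonstrate on a tiny data set. The sketch below uses two throwaway tables, n1 and n2, whose names and contents are invented for this illustration.

```sql
create table n1 (id number);
create table n2 (id number);

insert into n1 values (1);
insert into n1 values (2);
insert into n2 values (1);
insert into n2 values (null);
commit;

-- NOT IN: the NULL in n2 makes the comparison "2 <> NULL" unknown,
-- so no rows are returned.
select * from n1 where id not in (select id from n2);

-- NOT EXISTS: the NULL row never matches n.id, so the row with id = 2 is returned.
select * from n1 n where not exists (select 1 from n2 where n2.id = n.id);

drop table n1;
drop table n2;
```

Because the two forms can return different rows once NULLs are present, the optimizer needs a null-aware anti join to unnest NOT IN on nullable columns.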
3. The second form of subquery unnesting
The subquery is kept intact and is joined to the outer tables or views as an inline view.
select * from t1 where object_id not in (select t2.object_id from t2,t3 where t2.object_id=t3.object_id and t2.object_id>40000);

Execution Plan
----------------------------------------------------------
Plan hash value: 2341321949

---------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          | 37968 |  4115K|       |  1543   (1)| 00:00:19 |
|*  1 |  HASH JOIN RIGHT ANTI SNA|          | 37968 |  4115K|  1184K|  1543   (1)| 00:00:19 |
|   2 |   VIEW                   | VW_NSO_1 | 48475 |   615K|       |   690   (1)| 00:00:09 |
|*  3 |    HASH JOIN             |          | 48475 |   852K|       |   690   (1)| 00:00:09 |
|*  4 |     TABLE ACCESS FULL    | T2       | 48477 |   236K|       |   345   (1)| 00:00:05 |
|*  5 |     TABLE ACCESS FULL    | T3       | 50981 |   647K|       |   345   (1)| 00:00:05 |
|   6 |   TABLE ACCESS FULL      | T1       | 86443 |  8272K|       |   345   (1)| 00:00:05 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("OBJECT_ID"="OBJECT_ID")
   3 - access("T2"."OBJECT_ID"="T3"."OBJECT_ID")
   4 - filter("T2"."OBJECT_ID">40000)
   5 - filter("T3"."OBJECT_ID">40000)

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
         17  recursive calls
          0  db block gets
       6582  consistent gets
       1235  physical reads
          0  redo size
    2026204  bytes sent via SQL*Net to client
      29604  bytes received via SQL*Net from client
       2646  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      39675  rows processed
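As the plan shows, Oracle does not dissolve the subquery and join its tables to t1 directly. Instead it first joins t2 and t3 to form an inline view (VW_NSO_1), and then hash anti joins that view to t1.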
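The test above uses a table t3 whose creation is not shown. A reasonable assumption, and only an assumption, is that it was built the same way as t1 and t2. Under that assumption the test can be reproduced, and the no_unnest hint can be used to compare the estimated cost against the FILTER plan:

```sql
-- Assumption: t3 is a copy of dba_objects, like t1 and t2.
create table t3 as select * from dba_objects;

-- Default behaviour: the optimizer decides whether to unnest into an inline view.
explain plan for
select *
  from t1
 where object_id not in (select t2.object_id
                           from t2, t3
                          where t2.object_id = t3.object_id
                            and t2.object_id > 40000);
select * from table(dbms_xplan.display);

-- Same statement with unnesting disabled, for a cost comparison.
explain plan for
select *
  from t1
 where object_id not in (select /*+ no_unnest */ t2.object_id
                           from t2, t3
                          where t2.object_id = t3.object_id
                            and t2.object_id > 40000);
select * from table(dbms_xplan.display);
```

EXPLAIN PLAN is used here instead of autotrace so that the potentially slow FILTER plan is only costed, not executed.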
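Whether a subquery can be unnested depends on two conditions:
a. The rewritten SQL that unnesting produces must be semantically identical to the original SQL. If it is not, the subquery cannot be unnested.
b. For the second form of unnesting, the subquery is unnested only if the unnested statement costs less than the original.
For the first form of unnesting, the subquery is unnested regardless of the cost before and after.
The following test verifies this: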
select /*+ cardinality(emp 100000) unnest */ * from emp where deptno in (select /*+ cardinality(dept 100000) */ deptno from dept);
select /*+ cardinality(emp 1) unnest */ * from emp where deptno in (select /*+ cardinality(dept 1) no_unnest */ deptno from dept);
As the output below shows, the unnested plan has the higher cost (276 versus 6), yet Oracle still chooses to unnest:
select /*+ cardinality(emp 100000) unnest */ * from emp where deptno in (select /*+ cardinality(dept 100000) */ deptno from dept);

Execution Plan
----------------------------------------------------------
Plan hash value: 1114788177

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 87879 |  3604K|   276   (2)| 00:00:04 |
|*  1 |  HASH JOIN          |      | 87879 |  3604K|   276   (2)| 00:00:04 |
|   2 |   SORT UNIQUE       |      |   100K|   292K|     3   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| DEPT |   100K|   292K|     3   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS FULL | EMP  |   100K|  3808K|     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DEPTNO"="DEPTNO")
   3 - filter("DEPTNO" IS NOT NULL)
   4 - filter("DEPTNO" IS NOT NULL)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         13  consistent gets
          0  physical reads
          0  redo size
       1419  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         10  rows processed

select /*+ cardinality(emp 1) unnest */ * from emp where deptno in (select /*+ cardinality(dept 1) no_unnest */ deptno from dept);

Execution Plan
----------------------------------------------------------
Plan hash value: 1499841400

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    39 |     6   (0)| 00:00:01 |
|*  1 |  FILTER            |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL| EMP  |     1 |    39 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| DEPT |     1 |     3 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS (SELECT /*+ NO_UNNEST OPT_ESTIMATE (TABLE "DEPT"
              ROWS=1.000000 ) */ 0 FROM "DEPT" "DEPT" WHERE "DEPTNO"=:B1))
   3 - filter("DEPTNO"=:B1)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         31  consistent gets
          0  physical reads
          0  redo size
       1419  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed
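The two statements above differ in both the unnesting hint and the cardinality hints, so their costs are not strictly like for like. A possible follow-up check, sketched here under the assumption that the same emp and dept tables are used, keeps the cardinality hints identical and changes only the unnesting hint, which is placed in the subquery block:

```sql
-- Unnesting requested, inflated cardinalities.
explain plan for
select /*+ cardinality(emp 100000) */ *
  from emp
 where deptno in (select /*+ cardinality(dept 100000) unnest */ deptno from dept);
select * from table(dbms_xplan.display);

-- Unnesting disabled, same cardinalities.
explain plan for
select /*+ cardinality(emp 100000) */ *
  from emp
 where deptno in (select /*+ cardinality(dept 100000) no_unnest */ deptno from dept);
select * from table(dbms_xplan.display);
```

If the first form of unnesting really ignores cost, a third run with no unnesting hint at all should produce the same join plan as the first statement, whichever of the two plans above turns out cheaper.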