Adaptive plan-part1-adaptive join method

CBO通过统计信息决定SQL的最优执行计划,如果统计信息不准确,谓词过于复杂,或者表连接基数估算不够准确便有可能导致错误的执行计划。在12c之前,生成执行计划后,sql执行期间便无法改变执行计划。但是从Oracle 12c开始,优化器引入了Adaptive Query Optimization技术,便是为了避免使用效率低下的执行计划。该特性包含adaptive query optimization和adaptive statistics两个方面。

Adaptive Plans可以被归类为2类,一类是join method,一类是Parallel distribution method。adaptive join method是一个可以让oracle的优化器自己学习的机制,可以使oracle优化器在统计信息不是最优的情况下在运行时间进行调整,所以优化器可以在语句的执行期间根据统计信息的实际情况改变执行计划。对于Adaptive Join Methods,一旦Oracle优化器决定了最终的执行计划,便会在后续的SQL语句执行过程中沿用该执行计划,并不再使用statistics collectors产生更优的执行计划,直到这个执行计划被刷出shared poo或者使用了不同的Oracle优化器特性(ACS、statistics feedback等)而重新触发statistics collectors。

数据库使用adaptive plan的前提是OPTIMIZER_FEATURES_ENABLE的版本为12.1.0.1及以上并且OPTIMIZER_ADAPTIVE_REPORTING_ONLY为默认的false。

本文讨论的是join method,实验环境为Oracle 12.1.0.2 OEL Linux
实验脚本如下

drop table t1 purge;
drop table generator purge;
create table generator as select * from dual connect by level <= 1000000;
create table t1(id number,skew number,name varchar2(10));
insert into t1 select rownum,1,'a' from generator where rownum <= 10;
insert into t1 select rownum,2,'b' from generator where rownum <= 100;
insert into t1 select rownum,3,'c' from generator where rownum <= 1000;
insert into t1 select rownum,4,'d' from generator where rownum <= 10000;
insert into t1 select rownum,5,'e' from generator where rownum <= 100000;
insert into t1 select rownum,6,'f' from generator where rownum <= 1000000; 
commit; 
create index idx_t1_id on t1(id); 
create index idx_t1_skew on t1(skew); 
select skew,count(*) from t1 group by skew order by 2; 
begin   dbms_stats.gather_table_stats(      
                user,       
                't1',       
                cascade => true,
        estimate_percent => null,
        method_opt => 'for all columns size 1'
    );
end;
/
select /*+ gather_plan_statistics */
 count(*)
  from t1 v1, t1 v2 where v1.id = v2.id and v2.skew = 4;
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
select * from table(dbms_xplan.display_cursor('fgbaykba91tbr',null,'adaptive allstats'));

delete from t1 where skew = 4;
insert into t1 select rownum,4,'d' from generator where rownum = 1;
select /*+ gather_plan_statistics */
 count(*)
  from t1 v1, t1 v2 where v1.id = v2.id and v2.skew = 4;
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
alter system flush shared_pool;
select /*+ gather_plan_statistics */
 count(*)
  from t1 v1, t1 v2 where v1.id = v2.id and v2.skew = 4;
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
select * from table(dbms_xplan.display_cursor('dv5xbtya6saum',null,'adaptive allstats'));

SQL> select skew,count(*) from t1 group by skew order by 2;

      SKEW   COUNT(*)
---------- ----------
         1         10
         2        100
         3       1000
         4      10000
         5     100000
         6    1000000

实验输出如下

SQL> select /*+ gather_plan_statistics */
  2   count(*)
  3    from t1 v1, t1 v2 where v1.id = v2.id and v2.skew = 4;

  COUNT(*)
----------
     31110

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
SQL_ID  dv5xbtya6saum, child number 0
-------------------------------------
select /*+ gather_plan_statistics */  count(*)   from t1 v1, t1 v2
where v1.id = v2.id and v2.skew = 4

Plan hash value: 1903897650

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |00:00:00.30 |    4592 |       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:00.30 |    4592 |       |       |          |
|*  2 |   HASH JOIN         |      |      1 |    205K|  31110 |00:00:00.10 |    4592 |  2440K|  2440K| 3026K (0)|
|*  3 |    TABLE ACCESS FULL| T1   |      1 |    185K|  10000 |00:00:00.01 |    2296 |       |       |          |
|   4 |    TABLE ACCESS FULL| T1   |      1 |   1111K|   1111K|00:00:00.10 |    2296 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("V1"."ID"="V2"."ID")
   3 - filter("V2"."SKEW"=4)

Note
-----
   - this is an adaptive plan

--使用dbms_xplan + adaptive参数输出adaptive plan的计划
SQL> select * from table(dbms_xplan.display_cursor('dv5xbtya6saum',null,'adaptive allstats'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------
SQL_ID  dv5xbtya6saum, child number 0
-------------------------------------
select /*+ gather_plan_statistics */  count(*)   from t1 v1, t1 v2
where v1.id = v2.id and v2.skew = 4

Plan hash value: 1903897650

-----------------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |
-----------------------------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT         |           |      1 |        |      1 |00:00:00.30 |    4592 |       |       |          |
|     1 |  SORT AGGREGATE          |           |      1 |      1 |      1 |00:00:00.30 |    4592 |       |       |          |
|  *  2 |   HASH JOIN              |           |      1 |    205K|  31110 |00:00:00.10 |    4592 |  2440K|  2440K|     1/0/0|
|-    3 |    NESTED LOOPS          |           |      1 |    205K|  10000 |00:00:00.01 |    2296 |       |       |          |
|-    4 |     STATISTICS COLLECTOR |           |      1 |        |  10000 |00:00:00.01 |    2296 |       |       |          |
|  *  5 |      TABLE ACCESS FULL   | T1        |      1 |    185K|  10000 |00:00:00.01 |    2296 |       |       |          |
|- *  6 |     INDEX RANGE SCAN     | IDX_T1_ID |      0 |      1 |      0 |00:00:00.01 |       0 |       |       |          |
|     7 |    TABLE ACCESS FULL     | T1        |      1 |   1111K|   1111K|00:00:00.11 |    2296 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("V1"."ID"="V2"."ID")
   5 - filter("V2"."SKEW"=4)
   6 - access("V1"."ID"="V2"."ID")

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)

对记录进行更新操作,使skew=4的记录只保留一条。

SQL> delete from t1 where skew = 4;

10000 rows deleted.

SQL>
SQL> insert into t1 select rownum,4,'d' from generator where rownum = 1;

1 row created.

此时执行计划进行nest loop join才是最优的执行计划,那么我们看下实际的运行情况是否如此

SQL> select /*+ gather_plan_statistics */
  2   count(*)
  3    from t1 v1, t1 v2 where v1.id = v2.id and v2.skew = 4;

  COUNT(*)
----------
         6

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
SQL_ID  dv5xbtya6saum, child number 0
-------------------------------------
select /*+ gather_plan_statistics */  count(*)   from t1 v1, t1 v2
where v1.id = v2.id and v2.skew = 4

Plan hash value: 1903897650

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |00:00:00.31 |    4592 |       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:00.31 |    4592 |       |       |          |
|*  2 |   HASH JOIN         |      |      1 |    205K|      6 |00:00:00.06 |    4592 |  2440K|  2440K|  778K (0)|
|*  3 |    TABLE ACCESS FULL| T1   |      1 |    185K|      1 |00:00:00.06 |    2296 |       |       |          |
|   4 |    TABLE ACCESS FULL| T1   |      1 |   1111K|   1101K|00:00:00.12 |    2296 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("V1"."ID"="V2"."ID")
   3 - filter("V2"."SKEW"=4)

Note
-----
   - this is an adaptive plan

从上面的输出看到,Oracle优化器并没有选择nest loop进行关联操作。
接下来我将执行计划刷新出共享池(或者使用dbms_shared_pool.purge()),执行计划改为Nest Loop。

SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> select /*+ gather_plan_statistics */
  2   count(*)
  3    from t1 v1, t1 v2 where v1.id = v2.id and v2.skew = 4;

  COUNT(*)
----------
         6

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
----------------------------------------
SQL_ID  dv5xbtya6saum, child number 0
-------------------------------------
select /*+ gather_plan_statistics */  count(*)   from t1 v1, t1 v2
where v1.id = v2.id and v2.skew = 4

Plan hash value: 2926476411

-------------------------------------------------------------------------------------------
| Id  | Operation           | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |      1 |        |      1 |00:00:00.06 |    2299 |
|   1 |  SORT AGGREGATE     |           |      1 |      1 |      1 |00:00:00.06 |    2299 |
|   2 |   NESTED LOOPS      |           |      1 |    205K|      6 |00:00:00.06 |    2299 |
|*  3 |    TABLE ACCESS FULL| T1        |      1 |    185K|      1 |00:00:00.06 |    2296 |
|*  4 |    INDEX RANGE SCAN | IDX_T1_ID |      1 |      1 |      6 |00:00:00.01 |       3 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("V2"."SKEW"=4)
   4 - access("V1"."ID"="V2"."ID")

Note
-----
   - this is an adaptive plan

SQL> select * from table(dbms_xplan.display_cursor('dv5xbtya6saum',null,'adaptive allstats'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------
SQL_ID  dv5xbtya6saum, child number 0
-------------------------------------
select /*+ gather_plan_statistics */  count(*)   from t1 v1, t1 v2
where v1.id = v2.id and v2.skew = 4

Plan hash value: 2926476411

--------------------------------------------------------------------------------------------------
|   Id  | Operation                | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT         |           |      1 |        |      1 |00:00:00.06 |    2299 |
|     1 |  SORT AGGREGATE          |           |      1 |      1 |      1 |00:00:00.06 |    2299 |
|- *  2 |   HASH JOIN              |           |      1 |    205K|      6 |00:00:00.06 |    2299 |
|     3 |    NESTED LOOPS          |           |      1 |    205K|      6 |00:00:00.06 |    2299 |
|-    4 |     STATISTICS COLLECTOR |           |      1 |        |      1 |00:00:00.06 |    2296 |
|  *  5 |      TABLE ACCESS FULL   | T1        |      1 |    185K|      1 |00:00:00.06 |    2296 |
|  *  6 |     INDEX RANGE SCAN     | IDX_T1_ID |      1 |      1 |      6 |00:00:00.01 |       3 |
|-    7 |    TABLE ACCESS FULL     | T1        |      0 |   1111K|      0 |00:00:00.01 |       0 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("V1"."ID"="V2"."ID")
   5 - filter("V2"."SKEW"=4)
   6 - access("V1"."ID"="V2"."ID")

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)

上述现象的潜在问题就是当数据实际已经变化了,但是执行计划并没有选择最优的去执行,只有在进行硬解析的时候才会选择最优,这类似于绑定变量窥探造成的问题。
为了避免这些由于这个特性造成的影响,最好的方式就是了解自己的数据库的数据分布并且对使用了adaptive plan的语句进行监控,可以针对v$sql或者v$sql_plan的 IS_RESOLVED_ADAPTIVE_PLAN列进行监控。
例如

SQL> select IS_RESOLVED_ADAPTIVE_PLAN, s.sql_text
  2    from v$sql s
  3   where sql_id = 'dv5xbtya6saum';

IS_ SQL_TEXT
--- ------------------------------------------------------------
Y   select /*+ gather_plan_statistics */  count(*)   from t1 v1,
     t1 v2 where v1.id = v2.id and v2.skew = 4

此条目发表在12c, CBO分类目录。将固定链接加入收藏夹。

One Response to Adaptive plan-part1-adaptive join method

  1. Pingback引用通告: Adaptive plans-part2-Parallel Distribution Methods | LEO Notes

发表评论

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / 更改 )

Twitter picture

You are commenting using your Twitter account. Log Out / 更改 )

Facebook photo

You are commenting using your Facebook account. Log Out / 更改 )

Google+ photo

You are commenting using your Google+ account. Log Out / 更改 )

Connecting to %s