Adaptive plans-part2-Parallel Distribution Methods

在上篇文章中介绍了adative plan关于join method特性,本文接下来验证一下Parallel Distribution Methods。

通常情况下,当并行语句执行的时候,比如进行join,排序或者聚合操作,Oracle优化器会决定并行进程间数据重新分布的方式。数据分布的方式主要依赖对于记录数的估算,所以并行语句的性能很大程度上要依靠cardinality的估算,如果估算错误,那么并行服务进程将会被过度的使用并且造成数据不能在所有的并行进程间均匀的分布。

在12c之前,oracle并行进程的数据分布方式有broadcast和hash分布方式2种,从12c开始,oracle引入了adaptive parallel distribution method,这个新的方法使用了HYBRID HASH分布方式,并且这个新的方式可以使分布方式的选择推迟到SQL执行阶段,也就是STATISTICS COLLECTOR阶段,STATISTICS COLLECTOR在build table中缓存结果并且与一个阀值进行比较,如果缓存的结果数量超过了这个阀值,那么就使用hash分布,反之将hash转换为broadcast方式,于此同时,如果build table使用了BROADCAST方式那么probe table则使用RANDOM或者ROUND ROBIN方式,否则使用hash广播。这个阀值按照DOP*2来进行计算,如果DOP为2,那么阀值为4。如果STATISTICS COLLECTOR缓存了5行,因为5>4,所以超过了这个阀值,那么并行服务进程将采用hash分布方式,否则的话采用BROADCAST方式。

实验环境12.1.0.2
实验脚本如下

alter session set OPTIMIZER_USE_SQL_PLAN_BASELINES=false;
alter session set "_bloom_filter_enabled"=false;
drop table t1 purge;
drop table t2 purge;
drop table generator purge;
create table generator as select * from dual connect by level <= 100;
create table t1(id number,skew number,name varchar2(10));
insert into t1 select rownum,1,'a' from generator where rownum <= 1;
insert into t1 select rownum,2,'b' from generator where rownum <= 2;
insert into t1 select rownum,3,'c' from generator where rownum <= 3;
insert into t1 select rownum,4,'d' from generator where rownum <= 4;
insert into t1 select rownum,5,'e' from generator where rownum <= 5;
insert into t1 select rownum,6,'f' from generator where rownum <= 6;
insert into t1 select rownum,7,'g' from generator where rownum <= 7;
insert into t1 select rownum,8,'h' from generator where rownum <= 8;
insert into t1 select rownum,9,'i' from generator where rownum <= 9;
insert into t1 select rownum,10,'j' from generator where rownum <= 10; 
commit;  
create table t2 as select * from t1; 
begin          
     dbms_stats.gather_table_stats(                 
        ownname      => user,
        tabname      =>'T1',
        method_opt   => 'for columns SKEW size 10'
    );
    dbms_stats.gather_table_stats(
        ownname      => user,
        tabname      =>'T2',
        method_opt   => 'for columns SKEW size 10'
    );
end;
/
alter session set "_px_trace"= "all"; --trace px 
select /*+ gather_plan_statistics parallel(2) pq_distribute(t1 hash hash) */
 count(*)
  from t1, t2 where t1.id = t2.id and t2.skew = 7;
alter session set "_px_trace"="none";
select * from table(dbms_xplan.display_cursor(null,null,'typical'));

select tq_id,
       server_type,
       instance,
       process,
       num_rows
from   v$pq_tqstat
order  by dfo_number,tq_id,server_type desc,instance,process;

select /*+ gather_plan_statistics parallel(4) pq_distribute(t1 hash hash) */
 count(*)
  from t1, t2 where t1.id = t2.id and t2.skew = 7;

select * from table(dbms_xplan.display_cursor(null,null,'typical'));

select tq_id,
       server_type,
       instance,
       process,
       num_rows
from   v$pq_tqstat
order  by dfo_number,tq_id,server_type desc,instance,process;

DOP为2的时候的执行计划输出

SQL> select * from table(dbms_xplan.display_cursor(null,null,'typical'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
SQL_ID  d9wc4x1t8gs8t, child number 0
-------------------------------------
select /*+ gather_plan_statistics parallel(2) pq_distribute(t1 hash
hash) */  count(*)   from t1, t2 where t1.id = t2.id and t2.skew = 7

Plan hash value: 2162392833

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |       |       |     4 (100)|          |        |      |            |
|   1 |  SORT AGGREGATE              |          |     1 |    14 |            |          |        |      |            |
|   2 |   PX COORDINATOR             |          |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)       | :TQ10002 |     1 |    14 |            |          |  Q1,02 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE           |          |     1 |    14 |            |          |  Q1,02 | PCWP |            |
|*  5 |      HASH JOIN               |          |    39 |   546 |     4   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|   6 |       PX RECEIVE             |          |     7 |    42 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|   7 |        PX SEND HYBRID HASH   | :TQ10000 |     7 |    42 |     2   (0)| 00:00:01 |  Q1,00 | P->P | HYBRID HASH|
|   8 |         STATISTICS COLLECTOR |          |       |       |            |          |  Q1,00 | PCWC |            |
|   9 |          PX BLOCK ITERATOR   |          |     7 |    42 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|* 10 |           TABLE ACCESS FULL  | T2       |     7 |    42 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
|  11 |       PX RECEIVE             |          |    55 |   440 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|  12 |        PX SEND HYBRID HASH   | :TQ10001 |    55 |   440 |     2   (0)| 00:00:01 |  Q1,01 | P->P | HYBRID HASH|
|  13 |         PX BLOCK ITERATOR    |          |    55 |   440 |     2   (0)| 00:00:01 |  Q1,01 | PCWC |            |
|* 14 |          TABLE ACCESS FULL   | T1       |    55 |   440 |     2   (0)| 00:00:01 |  Q1,01 | PCWP |            |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("T1"."ID"="T2"."ID")
  10 - access(:Z>=:Z AND :Z<=:Z)        filter("T2"."SKEW"=7)   
  14 - access(:Z>=:Z AND :Z<=:Z)

Note
-----
   - Degree of Parallelism is 2 because of hint

上述的执行计划中oracle优化器选择使用HYBRID HASH方法做数据分布,STATISTICS COLLECTOR 缓存了7条记录,2*DOP为2X2=4,小于7,优化器选择使用hash分布。查询一下v$pq_tqstat获取并行进程查询的数据分布。

SQL> select tq_id,
  2         server_type,
  3         instance,
  4         process,
  5         num_rows
  6  from   v$pq_tqstat
  7  order  by dfo_number,tq_id,server_type desc,instance,process;

     TQ_ID SERVER_TYPE                                                    INSTANCE PROCESS                                NUM_ROWS
---------- ------------------------------------------------------------ ---------- ------------------------------------ ----------
         0 Producer                                                              1 P002                                          7
         0 Producer                                                              1 P003                                          0
         0 Consumer                                                              1 P000                                          5
         0 Consumer                                                              1 P001                                          2
         1 Producer                                                              1 P002                                         55
         1 Producer                                                              1 P003                                          0
         1 Consumer                                                              1 P000                                         36
         1 Consumer                                                              1 P001                                         19
         2 Producer                                                              1 P000                                          1
         2 Producer                                                              1 P001                                          1
         2 Consumer                                                              1 QC                                            2

上述的执行计划中STATISTICS COLLECTOR 处在table queue 10000中,对应的TQ_ID=0的消费者中。2个消费者一个是5一个是2.

下面增加DOP的数量

QL> select /*+ gather_plan_statistics parallel(4) pq_distribute(t1 hash hash) */
  2   count(*)
  3    from t1, t2 where t1.id = t2.id and t2.skew = 7;

  COUNT(*)
----------
        49

SQL>
SQL> select * from table(dbms_xplan.display_cursor(null,null,'typical'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------
SQL_ID  7gp75sk4mun08, child number 0
-------------------------------------
select /*+ gather_plan_statistics parallel(4) pq_distribute(t1 hash
hash) */  count(*)   from t1, t2 where t1.id = t2.id and t2.skew = 7

Plan hash value: 2162392833

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |       |       |     4 (100)|          |        |      |            |
|   1 |  SORT AGGREGATE              |          |     1 |    14 |            |          |        |      |            |
|   2 |   PX COORDINATOR             |          |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)       | :TQ10002 |     1 |    14 |            |          |  Q1,02 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE           |          |     1 |    14 |            |          |  Q1,02 | PCWP |            |
|*  5 |      HASH JOIN               |          |    39 |   546 |     4   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|   6 |       PX RECEIVE             |          |     7 |    42 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|   7 |        PX SEND HYBRID HASH   | :TQ10000 |     7 |    42 |     2   (0)| 00:00:01 |  Q1,00 | P->P | HYBRID HASH|
|   8 |         STATISTICS COLLECTOR |          |       |       |            |          |  Q1,00 | PCWC |            |
|   9 |          PX BLOCK ITERATOR   |          |     7 |    42 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|* 10 |           TABLE ACCESS FULL  | T2       |     7 |    42 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
|  11 |       PX RECEIVE             |          |    55 |   440 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|  12 |        PX SEND HYBRID HASH   | :TQ10001 |    55 |   440 |     2   (0)| 00:00:01 |  Q1,01 | P->P | HYBRID HASH|
|  13 |         PX BLOCK ITERATOR    |          |    55 |   440 |     2   (0)| 00:00:01 |  Q1,01 | PCWC |            |
|* 14 |          TABLE ACCESS FULL   | T1       |    55 |   440 |     2   (0)| 00:00:01 |  Q1,01 | PCWP |            |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("T1"."ID"="T2"."ID")
  10 - access(:Z>=:Z AND :Z<=:Z)        filter("T2"."SKEW"=7)   
  14 - access(:Z>=:Z AND :Z<=:Z)

Note
-----
   - Degree of Parallelism is 4 because of hint

此时由于2*DOP为8,比7大,所以优化器选择广播的方式进行数据分布,如果进行广播的方式,包含STATISTICS COLLECTOR的消费者每个都会具有相同的记录。进行通过下面的语句获取各个进程的数据分布。

SQL> select tq_id,
  2         server_type,
  3         instance,
  4         process,
  5         num_rows
  6  from   v$pq_tqstat
  7  order  by dfo_number,tq_id,server_type desc,instance,process;

     TQ_ID SERVER_TYPE                                                    INSTANCE PROCESS                                NUM_ROWS
---------- ------------------------------------------------------------ ---------- ------------------------------------ ----------
         0 Producer                                                              1 P004                                         28
         0 Producer                                                              1 P005                                          0
         0 Producer                                                              1 P006                                          0
         0 Producer                                                              1 P007                                          0
         0 Consumer                                                              1 P000                                          7
         0 Consumer                                                              1 P001                                          7
         0 Consumer                                                              1 P002                                          7
         0 Consumer                                                              1 P003                                          7
         1 Producer                                                              1 P004                                         55
         1 Producer                                                              1 P005                                          0
         1 Producer                                                              1 P006                                          0
         1 Producer                                                              1 P007                                          0
         1 Consumer                                                              1 P000                                         14
         1 Consumer                                                              1 P001                                         14
         1 Consumer                                                              1 P002                                         14
         1 Consumer                                                              1 P003                                         13
         2 Producer                                                              1 P000                                          1
         2 Producer                                                              1 P001                                          1
         2 Producer                                                              1 P002                                          1
         2 Producer                                                              1 P003                                          1
         2 Consumer                                                              1 QC                                            4

TQ10000的7条记录已经被广播(BROADCAST)给当前TQ的所有的消费者。

此条目发表在12c, CBO, Parallel分类目录。将固定链接加入收藏夹。

发表评论

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / 更改 )

Twitter picture

You are commenting using your Twitter account. Log Out / 更改 )

Facebook photo

You are commenting using your Facebook account. Log Out / 更改 )

Google+ photo

You are commenting using your Google+ account. Log Out / 更改 )

Connecting to %s