后浪云OceanBase教程:OceanBase 分布式计划的生成

OceanBase 数据库的优化器会分为以下两大阶段来生成分布式的执行计划。

1. 第一阶段:不考虑数据的物理分布,生成所有基于本地关系优化的最优执行计划。在本地计划生成后,优化器会检查数据是否访问了多个分区,或者是否访问的是本地单分区表但是用户使用 HINT 强制采用了并行查询执行。

2. 第二阶段:生成分布式计划。根据执行计划树,在需要进行数据重分布的地方,插入 EXCHANGE 节点,从而将原先的本地计划树变成分布式执行计划。

分布式执行计划的算子

生成分布式计划的过程就是在原始计划树上寻找恰当位置插入 EXCHANGE 算子的过程,在自顶向下遍历计划树的时候,需要根据相应算子的数据处理情况以及输入算子的数据分区情况,来决定是否需要插入 EXCHANGE 算子。

如下示例为最简单的单表扫描:


obclient>CREATE TABLE t1 (v1 INT, v2 INT) PARTITION BY HASH(v1) PARTITIONS 5;
Query OK, 0 rows affected (0.12 sec)

obclient>EXPLAIN SELECT * FROM t1\G;
*************************** 1. row ***************************
Query Plan:
==============================================
|ID|OPERATOR               |NAME    |EST. ROWS|COST  |
------------------------------------------------------
|0 |PX COORDINATOR         |        |500000   |545109|
|1 | EXCHANGE OUT DISTR    |:EX10000|500000   |320292|
|2 |  PX PARTITION ITERATOR|        |500000   |320292|
|3 |   TABLE SCAN          |T1      |500000   |320292|
======================================================

Outputs & filters:
-------------------------------------
  0 - output([T1.V1], [T1.V2]), filter(nil)
  1 - output([T1.V1], [T1.V2]), filter(nil), dop=1
  2 - output([T1.V1], [T1.V2]), filter(nil)
  3 - output([T1.V1], [T1.V2]), filter(nil),
      access([T1.V1], [T1.V2]), partitions(p[0-4])

当 t1 是一个分区表,可以在 TABLE SCAN 上插入配对的 EXCHANGE 算子,从而将 TABLE SCAN 和 EXCHANGE OUT 封装成一个 job,可以用于并行的执行。

单输入可下压算子

单输入可下压算子主要包括 AGGREGATION、SORT、GROUP BY 和 LIMIT 算子等,除了 LIMIT 算子以外,其余所列举的算子都会有一个操作的键,如果操作的键和输入数据的数据分布是一致的,则可以做一阶段聚合操作,也即 Partition Wise Aggregation。如果操作的键和输入数据的数据分布是不一致的,则需要做两阶段聚合操作,聚合算子需要做下压操作。

一阶段聚合操作如下例所示:


obclient>CREATE TABLE t2 (v1 INT, v2 INT) PARTITION BY HASH(v1) PARTITIONS 4;
Query OK, 0 rows affected (0.12 sec)

obclient>EXPLAIN SELECT SUM(v1) FROM t2 GROUP BY v1\G;
*************************** 1. row ***************************
Query Plan:
| ======================================================
|ID|OPERATOR               |NAME    |EST. ROWS|COST  |
------------------------------------------------------
|0 |PX COORDINATOR         |        |101      |357302|
|1 | EXCHANGE OUT DISTR    |:EX10000|101      |357297|
|2 |  PX PARTITION ITERATOR|        |101      |357297|
|3 |   MERGE GROUP BY      |        |101      |357297|
|4 |    TABLE SCAN         |t2      |400000   |247403|
======================================================

Outputs & filters:
-------------------------------------
  0 - output([T_FUN_SUM(t2.v1)]), filter(nil)
  1 - output([T_FUN_SUM(t2.v1)]), filter(nil), dop=1
  2 - output([T_FUN_SUM(t2.v1)]), filter(nil)
  3 - output([T_FUN_SUM(t2.v1)]), filter(nil),
      group([t2.v1]), agg_func([T_FUN_SUM(t2.v1)])
  4 - output([t2.v1]), filter(nil),
      access([t2.v1]), partitions(p[0-3])

二阶段聚合操作如下例所示:


| ============================================================
|ID|OPERATOR                     |NAME    |EST. ROWS|COST  |
------------------------------------------------------------
|0 |PX COORDINATOR               |        |101      |561383|
|1 | EXCHANGE OUT DISTR          |:EX10001|101      |561374|
|2 |  HASH GROUP BY              |        |101      |561374|
|3 |   EXCHANGE IN DISTR         |        |101      |408805|
|4 |    EXCHANGE OUT DISTR (HASH)|:EX10000|101      |408795|
|5 |     HASH GROUP BY           |        |101      |408795|
|6 |      PX PARTITION ITERATOR  |        |400000   |256226|
|7 |       TABLE SCAN            |t2      |400000   |256226|
============================================================

Outputs & filters:
-------------------------------------
  0 - output([T_FUN_SUM(T_FUN_SUM(t2.v1))]), filter(nil)
  1 - output([T_FUN_SUM(T_FUN_SUM(t2.v1))]), filter(nil), dop=1
  2 - output([T_FUN_SUM(T_FUN_SUM(t2.v1))]), filter(nil),
      group([t2.v2]), agg_func([T_FUN_SUM(T_FUN_SUM(t2.v1))])
  3 - output([t2.v2], [T_FUN_SUM(t2.v1)]), filter(nil)
  4 - (#keys=1, [t2.v2]), output([t2.v2], [T_FUN_SUM(t2.v1)]), filter(nil), dop=1
  5 - output([t2.v2], [T_FUN_SUM(t2.v1)]), filter(nil),
      group([t2.v2]), agg_func([T_FUN_SUM(t2.v1)])
  6 - output([t2.v1], [t2.v2]), filter(nil)
  7 - output([t2.v1], [t2.v2]), filter(nil),
      access([t2.v1], [t2.v2]), partitions(p[0-3])

二元输入算子

二元输入算子主要考虑 JOIN 算子的情况。对于 JOIN 算子来说,主要基于规则来生成分布式执行计划和选择数据重分布方法。JOIN 算子主要有以下三种联接方式:

  • Partition-Wise Join

    当左右表都是分区表且分区方式相同,物理分布一样,并且 JOIN 的联接条件为分区键时,可以使用以分区为单位的联接方法。如下例所示:

    obclient>CREATE TABLE t3 (v1 INT, v2 INT) PARTITION BY HASH(v1) PARTITIONS 4;
    Query OK, 0 rows affected (0.12 sec)
    
    obclient>EXPLAIN SELECT * FROM t2, t3 WHERE t2.v1 = t3.v1\G;
    *************************** 1. row ***************************
    Query Plan: 
    ===========================================================
    |ID|OPERATOR               |NAME    |EST. ROWS |COST      |
    |0 |PX COORDINATOR         |        |1568160000|1227554264|
    |1 | EXCHANGE OUT DISTR    |:EX10000|1568160000|930670004 |
    |2 |  PX PARTITION ITERATOR|        |1568160000|930670004 |
    |3 |   MERGE JOIN          |        |1568160000|930670004 |
    |4 |    TABLE SCAN         |t2      |400000    |256226    |
    |5 |    TABLE SCAN         |t3      |400000    |256226    |
    ===========================================================
    
    Outputs & filters:
    -------------------------------------
      0 - output([t2.v1], [t2.v2], [t3.v1], [t3.v2]), filter(nil)
      1 - output([t2.v1], [t2.v2], [t3.v1], [t3.v2]), filter(nil), dop=1
      2 - output([t2.v1], [t2.v2], [t3.v1], [t3.v2]), filter(nil)
      3 - output([t2.v1], [t2.v2], [t3.v1], [t3.v2]), filter(nil),
          equal_conds([t2.v1 = t3.v1]), other_conds(nil)
      4 - output([t2.v1], [t2.v2]), filter(nil),
          access([t2.v1], [t2.v2]), partitions(p[0-3])
      5 - output([t3.v1], [t3.v2]), filter(nil),
          access([t3.v1], [t3.v2]), partitions(p[0-3])
  • Partial Partition-Wise Join

    当左右表中一个表为分区表,另一个表为非分区表,或者两者皆为分区表但是联接键仅和其中一个分区表的分区键相同的情况下,会以该分区表的分区分布为基准,重新分布另一个表的数据。如下例所示:

    obclient>CREATE TABLE t4 (v1 INT, v2 INT) PARTITION BY HASH(v1) PARTITIONS 3;
    Query OK, 0 rows affected (0.12 sec)
    
    obclient>EXPLAIN SELECT * FROM t4, t2 WHERE t2.v1 = t4.v1\G;
    *************************** 1. row ***************************
    Query Plan:
     ===========================================================
    |ID|OPERATOR                     |NAME    |EST. ROWS|COST |
    -----------------------------------------------------------
    |0 |PX COORDINATOR               |        |11880    |17658|
    |1 | EXCHANGE OUT DISTR          |:EX10001|11880    |15409|
    |2 |  NESTED-LOOP JOIN           |        |11880    |15409|
    |3 |   EXCHANGE IN DISTR         |        |3        |37   |
    |4 |    EXCHANGE OUT DISTR (PKEY)|:EX10000|3        |37   |
    |5 |     PX PARTITION ITERATOR   |        |3        |37   |
    |6 |      TABLE SCAN             |t4      |3        |37   |
    |7 |   PX PARTITION ITERATOR     |        |3960     |2561 |
    |8 |    TABLE SCAN               |t2      |3960     |2561 |
    ===========================================================
    
    Outputs & filters:
    -------------------------------------
      0 - output([t4.v1], [t4.v2], [t2.v1], [t2.v2]), filter(nil)
      1 - output([t4.v1], [t4.v2], [t2.v1], [t2.v2]), filter(nil), dop=1
      2 - output([t4.v1], [t4.v2], [t2.v1], [t2.v2]), filter(nil),
          conds(nil), nl_params_([t4.v1])
      3 - output([t4.v1], [t4.v2]), filter(nil)
      4 - (#keys=1, [t4.v1]), output([t4.v1], [t4.v2]), filter(nil), dop=1
      5 - output([t4.v1], [t4.v2]), filter(nil)
      6 - output([t4.v1], [t4.v2]), filter(nil),
          access([t4.v1], [t4.v2]), partitions(p[0-2])
      7 - output([t2.v1], [t2.v2]), filter(nil)
      8 - output([t2.v1], [t2.v2]), filter(nil),
          access([t2.v1], [t2.v2]), partitions(p[0-3])
  • 数据重分布

    当联接键和左右表的分区键都没有关系的情况下,可以根据规则计算来选择使用 BROADCAST 还是 HASH HASH 的数据重分布方式,如下例所示:

    注意 

    只有在并行度大于 1 时, 以下示例中两种数据重分发方式才有可能被选中。

    obclient>EXPLAIN SELECT /*+ PARALLEL(2)*/* FROM t4, t2 WHERE t2.v2 = t4.v2\G;
    *************************** 1. row ***************************
    Query Plan:
    =================================================================
    |ID|OPERATOR                          |NAME    |EST. ROWS|COST  |
    -----------------------------------------------------------------
    |0 |PX COORDINATOR                    |        |11880    |396863|
    |1 | EXCHANGE OUT DISTR               |:EX10001|11880    |394614|
    |2 |  HASH JOIN                       |        |11880    |394614|
    |3 |   EXCHANGE IN DISTR              |        |3        |37    |
    |4 |    EXCHANGE OUT DISTR (BROADCAST)|:EX10000|3        |37    |
    |5 |     PX BLOCK ITERATOR            |        |3        |37    |
    |6 |      TABLE SCAN                  |t4      |3        |37    |
    |7 |   PX PARTITION ITERATOR          |        |400000   |256226|
    |8 |    TABLE SCAN                    |t2      |400000   |256226|
    =================================================================
    
    Outputs & filters:
    -------------------------------------
      0 - output([t4.v1], [t4.v2], [t2.v1], [t2.v2]), filter(nil)
      1 - output([t4.v1], [t4.v2], [t2.v1], [t2.v2]), filter(nil), dop=2
      2 - output([t4.v1], [t4.v2], [t2.v1], [t2.v2]), filter(nil),
          equal_conds([t2.v2 = t4.v2]), other_conds(nil)
      3 - output([t4.v1], [t4.v2]), filter(nil)
      4 - output([t4.v1], [t4.v2]), filter(nil), dop=2
      5 - output([t4.v1], [t4.v2]), filter(nil)
      6 - output([t4.v1], [t4.v2]), filter(nil),
          access([t4.v1], [t4.v2]), partitions(p[0-2])
      7 - output([t2.v1], [t2.v2]), filter(nil)
      8 - output([t2.v1], [t2.v2]), filter(nil),
          access([t2.v1], [t2.v2]), partitions(p[0-3])
          
    
    obclient>EXPLAIN SELECT /*+ PQ_DISTRIBUTE(t2 HASH HASH) PARALLEL(2)*/* FROM t4, t2 
                  WHERE t2.v2 = t4.v2\G;
    *************************** 1. row ***************************
    Query Plan:
     ============================================================
    |ID|OPERATOR                     |NAME    |EST. ROWS|COST  |
    ------------------------------------------------------------
    |0 |PX COORDINATOR               |        |11880    |434727|
    |1 | EXCHANGE OUT DISTR          |:EX10002|11880    |432478|
    |2 |  HASH JOIN                  |        |11880    |432478|
    |3 |   EXCHANGE IN DISTR         |        |3        |37    |
    |4 |    EXCHANGE OUT DISTR (HASH)|:EX10000|3        |37    |
    |5 |     PX BLOCK ITERATOR       |        |3        |37    |
    |6 |      TABLE SCAN             |t4      |3        |37    |
    |7 |   EXCHANGE IN DISTR         |        |400000   |294090|
    |8 |    EXCHANGE OUT DISTR (HASH)|:EX10001|400000   |256226|
    |9 |     PX PARTITION ITERATOR   |        |400000   |256226|
    |10|      TABLE SCAN             |t2      |400000   |256226|
    ============================================================
    
    Outputs & filters:
    -------------------------------------
      0 - output([t4.v1], [t4.v2], [t2.v1], [t2.v2]), filter(nil)
      1 - output([t4.v1], [t4.v2], [t2.v1], [t2.v2]), filter(nil), dop=2
      2 - output([t4.v1], [t4.v2], [t2.v1], [t2.v2]), filter(nil),
          equal_conds([t2.v2 = t4.v2]), other_conds(nil)
      3 - output([t4.v1], [t4.v2]), filter(nil)
      4 - (#keys=1, [t4.v2]), output([t4.v1], [t4.v2]), filter(nil), dop=2
      5 - output([t4.v1], [t4.v2]), filter(nil)
      6 - output([t4.v1], [t4.v2]), filter(nil),
          access([t4.v1], [t4.v2]), partitions(p[0-2])
      7 - output([t2.v1], [t2.v2]), filter(nil)
      8 - (#keys=1, [t2.v2]), output([t2.v1], [t2.v2]), filter(nil), dop=2
      9 - output([t2.v1], [t2.v2]), filter(nil)
      10 - output([t2.v1], [t2.v2]), filter(nil),
          access([t2.v1], [t2.v2]), partitions(p[0-3])

文章来源网络,作者:运维,如若转载,请注明出处:https://shuyeidc.com/wp/286201.html<

(0)
运维的头像运维
上一篇2025-05-15 06:06
下一篇 2025-05-15 06:07

相关推荐

  • 个人主题怎么制作?

    制作个人主题是一个将个人风格、兴趣或专业领域转化为视觉化或结构化内容的过程,无论是用于个人博客、作品集、社交媒体账号还是品牌形象,核心都是围绕“个人特色”展开,以下从定位、内容规划、视觉设计、技术实现四个维度,详细拆解制作个人主题的完整流程,明确主题定位:找到个人特色的核心主题定位是所有工作的起点,需要先回答……

    2025-11-20
    0
  • 社群营销管理关键是什么?

    社群营销的核心在于通过建立有温度、有价值、有归属感的社群,实现用户留存、转化和品牌传播,其管理需贯穿“目标定位-内容运营-用户互动-数据驱动-风险控制”全流程,以下从五个维度展开详细说明:明确社群定位与目标社群管理的首要任务是精准定位,需明确社群的核心价值(如行业交流、产品使用指导、兴趣分享等)、目标用户画像……

    2025-11-20
    0
  • 香港公司网站备案需要什么材料?

    香港公司进行网站备案是一个涉及多部门协调、流程相对严谨的过程,尤其需兼顾中国内地与香港两地的监管要求,由于香港公司注册地与中国内地不同,其网站若主要服务内地用户或使用内地服务器,需根据服务器位置、网站内容性质等,选择对应的备案路径(如工信部ICP备案或公安备案),以下从备案主体资格、流程步骤、材料准备、注意事项……

    2025-11-20
    0
  • 如何企业上云推广

    企业上云已成为数字化转型的核心战略,但推广过程中需结合行业特性、企业痛点与市场需求,构建系统性、多维度的推广体系,以下从市场定位、策略设计、执行落地及效果优化四个维度,详细拆解企业上云推广的实践路径,精准定位:明确目标企业与核心价值企业上云并非“一刀切”的方案,需先锁定目标客户群体,提炼差异化价值主张,客户分层……

    2025-11-20
    0
  • PS设计搜索框的实用技巧有哪些?

    在PS中设计一个美观且功能性的搜索框需要结合创意构思、视觉设计和用户体验考量,以下从设计思路、制作步骤、细节优化及交互预览等方面详细说明,帮助打造符合需求的搜索框,设计前的规划明确使用场景:根据网站或APP的整体风格确定搜索框的调性,例如极简风适合细线条和纯色,科技感适合渐变和发光效果,电商类则可能需要突出搜索……

    2025-11-20
    0

发表回复

您的邮箱地址不会被公开。必填项已用 * 标注