Using EXPLAIN to View Execution Plans of Statements That Use Subqueries

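    TiDB performs several subquery optimizations to improve the performance of subqueries. This document describes some of the optimizations applied to common subqueries and explains how to interpret the execution plan returned by the EXPLAIN statement.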

    The examples in this document use three sample tables: t1, t2, and t3.
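
    The table definitions are not reproduced here. The following is only a minimal sketch of a schema that would be consistent with the plans in this document. The primary key on t1.id, the non-unique index on t2.t1_id, and the UNIQUE constraint on t3.t1_id come from the text and the EXPLAIN output; the column types and the id columns of t2 and t3 are assumptions. The exact operators that EXPLAIN prints can also differ by TiDB version and by whether the primary key is clustered.

    ```sql
    -- Hypothetical schema, inferred only from the plans in this document.
    CREATE TABLE t1 (
        id BIGINT NOT NULL PRIMARY KEY,  -- plans show index:PRIMARY(id)
        int_col INT NOT NULL             -- referenced as t1.int_col; type is assumed
    );

    CREATE TABLE t2 (
        id BIGINT NOT NULL PRIMARY KEY,  -- assumed
        t1_id BIGINT NOT NULL,           -- type is assumed
        INDEX t1_id (t1_id)              -- plans show index:t1_id(t1_id), non-unique
    );

    CREATE TABLE t3 (
        id BIGINT NOT NULL PRIMARY KEY,  -- assumed
        t1_id BIGINT NOT NULL,           -- type is assumed
        UNIQUE KEY (t1_id)               -- the text relies on a UNIQUE constraint here
    );
    ```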

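    In the following example, the IN subquery searches table t2 for a list of IDs. For semantic correctness, TiDB must guarantee that the values of the t1_id column are unique. EXPLAIN shows that the execution plan removes the duplicates and then performs an Inner Join: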

    EXPLAIN SELECT * FROM t1 WHERE id IN (SELECT t1_id FROM t2);

    +----------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------------------------------------------------+
    | id                               | estRows  | task      | access object                | operator info                                                                                                             |
    +----------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------------------------------------------------+
    | IndexJoin_14                     | 5.00     | root      |                              | inner join, inner:IndexLookUp_13, outer key:test.t2.t1_id, inner key:test.t1.id, equal cond:eq(test.t2.t1_id, test.t1.id) |
    | ├─StreamAgg_49(Build)            | 5.00     | root      |                              | group by:test.t2.t1_id, funcs:firstrow(test.t2.t1_id)->test.t2.t1_id                                                      |
    | │ └─IndexReader_50               | 5.00     | root      |                              | index:StreamAgg_39                                                                                                        |
    | │   └─StreamAgg_39               | 5.00     | cop[tikv] |                              | group by:test.t2.t1_id,                                                                                                   |
    | │     └─IndexFullScan_31         | 50000.00 | cop[tikv] | table:t2, index:t1_id(t1_id) | keep order:true                                                                                                           |
    | └─IndexLookUp_13(Probe)          | 1.00     | root      |                              |                                                                                                                           |
    |   ├─IndexRangeScan_11(Build)     | 1.00     | cop[tikv] | table:t1, index:PRIMARY(id)  | range: decided by [eq(test.t1.id, test.t2.t1_id)], keep order:false                                                       |
    |   └─TableRowIDScan_12(Probe)     | 1.00     | cop[tikv] | table:t1                     | keep order:false                                                                                                          |
    +----------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------------------------------------------------+
    8 rows in set (0.00 sec)
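
    Conceptually, this plan corresponds to deduplicating the subquery output and then joining it with t1. The statement below is a hand-written sketch of that idea, not the statement TiDB executes internally; the alias dedup is hypothetical.

    ```sql
    -- Deduplicate t2.t1_id first (the role of the StreamAgg operators),
    -- then join the distinct IDs with t1 on its primary key.
    SELECT t1.*
    FROM t1
    INNER JOIN (
        SELECT DISTINCT t1_id
        FROM t2
    ) AS dedup            -- hypothetical alias
        ON t1.id = dedup.t1_id;
    ```

    Without the DISTINCT, a row of t1 would be returned once for every matching row in t2, which would no longer match the semantics of IN.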

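    In the plan for the t2 subquery, an aggregation (the StreamAgg operators) is required to make sure that the t1_id values are unique before they are joined with table t1. In the following plan, the UNIQUE constraint already guarantees that the values of t3.t1_id are unique, so no aggregation is needed: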

    +----------------------------------+---------+-----------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------+
    | id                               | estRows | task      | access object               | operator info                                                                                                             |
    +----------------------------------+---------+-----------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------+
    | IndexJoin_17                     | 1978.13 | root      |                             | inner join, inner:IndexLookUp_16, outer key:test.t3.t1_id, inner key:test.t1.id, equal cond:eq(test.t3.t1_id, test.t1.id) |
    | ├─TableReader_44(Build)          | 1978.00 | root      |                             | data:TableFullScan_43                                                                                                     |
    | │ └─TableFullScan_43             | 1978.00 | cop[tikv] | table:t3                    | keep order:false                                                                                                          |
    | └─IndexLookUp_16(Probe)          | 1.00    | root      |                             |                                                                                                                           |
    |   ├─IndexRangeScan_14(Build)     | 1.00    | cop[tikv] | table:t1, index:PRIMARY(id) | range: decided by [eq(test.t1.id, test.t3.t1_id)], keep order:false                                                       |
    |   └─TableRowIDScan_15(Probe)     | 1.00    | cop[tikv] | table:t1                    | keep order:false                                                                                                          |
    +----------------------------------+---------+-----------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------+
    6 rows in set (0.01 sec)

    Semantically, because the constraint guarantees that the values of t3.t1_id are unique, TiDB can execute the query directly as an INNER JOIN.
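
    The statement behind this plan is not shown in this document. Assuming it is the same IN subquery written against t3, the equivalence described here could be sketched as follows; both statements are illustrative.

    ```sql
    -- Assumed original form: IN subquery over the unique column t3.t1_id.
    SELECT * FROM t1 WHERE id IN (SELECT t1_id FROM t3);

    -- Equivalent join: each t1 row matches at most one t3 row,
    -- so no DISTINCT or aggregation is needed before joining.
    SELECT t1.*
    FROM t1
    INNER JOIN t3 ON t1.id = t3.t1_id;
    ```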

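    In both of the previous examples, TiDB can perform an Inner Join only after the subquery data is guaranteed to be unique, either through an aggregation (StreamAgg in the plan for t2) or through a UNIQUE constraint. Both joins use the Index Join algorithm. The following statement is executed differently, as a Semi Join: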

    EXPLAIN SELECT * FROM t1 WHERE id IN (SELECT t1_id FROM t2 WHERE t1_id != t1.int_col);

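    The execution plan of this statement shows that TiDB performs a Semi Join. Unlike an Inner Join, a Semi Join accepts only the first matching value on the right key, which means that duplicates are eliminated as part of the join operator itself. The join algorithm is Merge Join, which reads from the left and the right side simultaneously in sorted order, much like an efficient zipper merge.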

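    The original statement can be regarded as a correlated subquery, because it references the t1.int_col column from outside the subquery. However, the output of EXPLAIN shows the execution plan after the subquery has been decorrelated. The condition t1_id != t1.int_col is rewritten as t1.id != t1.int_col. Because TiDB can evaluate this condition while reading from table t1, in └─Selection_21, the decorrelation and rewrite make the execution considerably more efficient.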
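
    The rewrite can be pictured at the SQL level. The second statement below is a hand-written sketch of the decorrelated form, given only to illustrate the idea; it is not output produced by TiDB.

    ```sql
    -- Original, correlated form: the subquery references t1.int_col.
    SELECT * FROM t1
    WHERE id IN (SELECT t1_id FROM t2 WHERE t1_id != t1.int_col);

    -- Decorrelated sketch: a match requires t2.t1_id = t1.id, so the
    -- filter can be expressed on t1 alone and applied before the join.
    SELECT * FROM t1
    WHERE t1.id != t1.int_col
      AND id IN (SELECT t1_id FROM t2);
    ```

    In the second form the subquery no longer depends on the outer row, so it does not have to be re-evaluated for each row of t1.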

    In the following example, the query semantically returns all rows from table t3 unless the value of t3.t1_id appears in the subquery result:

    EXPLAIN SELECT * FROM t3 WHERE t1_id NOT IN (SELECT id FROM t1 WHERE int_col < 100);

    +----------------------------------+---------+-----------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+
    | id                               | estRows | task      | access object               | operator info                                                                                                                 |
    +----------------------------------+---------+-----------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+
    | IndexJoin_14                     | 1582.40 | root      |                             | anti semi join, inner:IndexLookUp_13, outer key:test.t3.t1_id, inner key:test.t1.id, equal cond:eq(test.t3.t1_id, test.t1.id) |
    | ├─TableReader_35(Build)          | 1978.00 | root      |                             | data:TableFullScan_34                                                                                                         |
    | │ └─TableFullScan_34             | 1978.00 | cop[tikv] | table:t3                    | keep order:false                                                                                                              |
    | └─IndexLookUp_13(Probe)          | 1.00    | root      |                             |                                                                                                                               |
    |   ├─IndexRangeScan_10(Build)     | 1.00    | cop[tikv] | table:t1, index:PRIMARY(id) | range: decided by [eq(test.t1.id, test.t3.t1_id)], keep order:false                                                           |
    |   └─Selection_12(Probe)          | 1.00    | cop[tikv] |                             | lt(test.t1.int_col, 100)                                                                                                      |
    |     └─TableRowIDScan_11          | 1.00    | cop[tikv] | table:t1                    | keep order:false                                                                                                              |
    +----------------------------------+---------+-----------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+
    7 rows in set (0.00 sec)
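
    The anti semi join in this plan keeps a row of t3 only when the probe on t1 finds no match. The same semantics can be sketched with NOT EXISTS, under the assumption that neither t3.t1_id nor t1.id can be NULL (NOT IN and NOT EXISTS treat NULL values differently). The statement is illustrative, not something TiDB generates.

    ```sql
    -- Keep each t3 row for which no t1 row with int_col < 100 has a matching id.
    -- Equivalent to the NOT IN form only when the compared columns are NOT NULL.
    SELECT *
    FROM t3
    WHERE NOT EXISTS (
        SELECT 1
        FROM t1
        WHERE t1.id = t3.t1_id
          AND t1.int_col < 100
    );
    ```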