
Explaining Query Plans Improved

New in 3.0.0

Spark 3.0 introduces new output modes for explaining query plans, available through the EXPLAIN SQL statement and the Dataset.explain operator.
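As a minimal sketch of the Dataset.explain side, the snippet below prints the plan of one small aggregation under each mode name that Dataset.explain(mode: String) accepts in Spark 3.0 (simple, extended, codegen, cost, formatted). The local SparkSession, the application name, the object name and the sample rows are assumptions made for illustration only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.max

object ExplainModesDemo {
  // Mode names accepted by Dataset.explain(mode: String) since Spark 3.0
  val modes: Seq[String] = Seq("simple", "extended", "codegen", "cost", "formatted")

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session; any existing SparkSession works too
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explain-modes-demo")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: three (key, val) rows, filtered and aggregated
    val query = Seq((0, 0), (0, 1), (1, 2)).toDF("key", "val")
      .where($"key" > 0)
      .groupBy($"key")
      .agg(max($"val"))

    // Print the plan once per mode so the outputs can be compared side by side
    modes.foreach { mode =>
      println(s"--- explain(\"$mode\") ---")
      query.explain(mode)
    }

    spark.stop()
  }
}
```

Calling explain() with no arguments is equivalent to the simple mode, so existing code keeps printing only the physical plan.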

EXPLAIN SQL Examples

See explain.sql for SQL examples of the EXPLAIN SQL statement.

SPARK-27395

JIRA issue: [SPARK-27395] New format of EXPLAIN command

Example 1

EXPLAIN
SELECT key, max(val)
FROM
    (SELECT col1 key, col2 val
    FROM VALUES (0, 0), (0, 1), (1, 2))
WHERE key > 0
GROUP BY key
HAVING max(val) > 0

== Physical Plan ==
*(2) Project [key#10, max(val)#20]
+- *(2) Filter (isnotnull(max(val#11)#23) AND (max(val#11)#23 > 0))
   +- *(2) HashAggregate(keys=[key#10], functions=[max(val#11)])
      +- Exchange hashpartitioning(key#10, 200), true, [id=#32]
         +- *(1) HashAggregate(keys=[key#10], functions=[partial_max(val#11)])
            +- *(1) LocalTableScan [key#10, val#11]

Example 2

EXPLAIN FORMATTED
SELECT (SELECT avg(a) FROM s1) + (SELECT avg(a) FROM s1)
FROM s1
LIMIT 1;
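
The statement above can also be run from Scala. What follows is only a sketch: it assumes s1 is a temporary view with a single numeric column a (the three rows are hypothetical), and it relies on EXPLAIN returning a one-row result whose single string column holds the plan text. The trailing semicolon is dropped because the statement is passed to spark.sql rather than to a SQL shell.

```scala
import org.apache.spark.sql.SparkSession

object ExplainFormattedDemo {
  // The EXPLAIN FORMATTED statement from Example 2, without the trailing semicolon
  val query: String =
    """EXPLAIN FORMATTED
      |SELECT (SELECT avg(a) FROM s1) + (SELECT avg(a) FROM s1)
      |FROM s1
      |LIMIT 1""".stripMargin

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explain-formatted-demo")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical contents for s1; only the column name `a` comes from the query
    Seq(1, 2, 3).toDF("a").createOrReplaceTempView("s1")

    // EXPLAIN yields a single row with one string column containing the plan
    val plan: String = spark.sql(query).head().getString(0)
    println(plan)

    spark.stop()
  }
}
```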