ExplainUtils¶
ExplainUtils
is a utility to process a query plan (when QueryExecution
is requested for a simple (basic) text representation for formatted
explain mode).
Demo¶
val q = spark.range(5).join(spark.range(10), Seq("id"), "inner")
scala> q.explain(mode = "formatted")
== Physical Plan ==
AdaptiveSparkPlan (6)
+- Project (5)
+- BroadcastHashJoin Inner BuildLeft (4)
:- BroadcastExchange (2)
: +- Range (1)
+- Range (3)
(1) Range
Output [1]: [id#0L]
Arguments: Range (0, 5, step=1, splits=Some(16))
(2) BroadcastExchange
Input [1]: [id#0L]
Arguments: HashedRelationBroadcastMode(List(input[0, bigint, false]),false), [id=#16]
(3) Range
Output [1]: [id#2L]
Arguments: Range (0, 10, step=1, splits=Some(16))
(4) BroadcastHashJoin
Left keys [1]: [id#0L]
Right keys [1]: [id#2L]
Join condition: None
(5) Project
Output [1]: [id#0L]
Input [2]: [id#0L, id#2L]
(6) AdaptiveSparkPlan
Output [1]: [id#0L]
Arguments: isFinalPlan=false
Note that the AdaptiveSparkPlan physical operator has isFinalPlan flag false
(and you can see part of the final output).
Execute Adaptive Query Execution optimization.
q.take(0)
The isFinalPlan flag should now be true
.
scala> q.explain(mode = "formatted")
== Physical Plan ==
AdaptiveSparkPlan (10)
+- == Final Plan ==
* Project (6)
+- * BroadcastHashJoin Inner BuildLeft (5)
:- BroadcastQueryStage (3)
: +- BroadcastExchange (2)
: +- * Range (1)
+- * Range (4)
+- == Initial Plan ==
Project (9)
+- BroadcastHashJoin Inner BuildLeft (8)
:- BroadcastExchange (7)
: +- Range (1)
+- Range (4)
(1) Range [codegen id : 1]
Output [1]: [id#0L]
Arguments: Range (0, 5, step=1, splits=Some(16))
(2) BroadcastExchange
Input [1]: [id#0L]
Arguments: HashedRelationBroadcastMode(List(input[0, bigint, false]),false), [id=#72]
(3) BroadcastQueryStage
Output [1]: [id#0L]
Arguments: 0
(4) Range
Output [1]: [id#2L]
Arguments: Range (0, 10, step=1, splits=Some(16))
(5) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [id#0L]
Right keys [1]: [id#2L]
Join condition: None
(6) Project [codegen id : 2]
Output [1]: [id#0L]
Input [2]: [id#0L, id#2L]
(7) BroadcastExchange
Input [1]: [id#0L]
Arguments: HashedRelationBroadcastMode(List(input[0, bigint, false]),false), [id=#16]
(8) BroadcastHashJoin
Left keys [1]: [id#0L]
Right keys [1]: [id#2L]
Join condition: None
(9) Project
Output [1]: [id#0L]
Input [2]: [id#0L, id#2L]
(10) AdaptiveSparkPlan
Output [1]: [id#0L]
Arguments: isFinalPlan=true
Processing Query Plan¶
processPlan[T <: QueryPlan[T]](
plan: T,
append: String => Unit): Unit
processPlan
...FIXME
processPlan
is used when:
QueryExecution
is requested to simpleString
processPlanSkippingSubqueries¶
processPlanSkippingSubqueries[T <: QueryPlan[T]](
plan: T,
append: String => Unit,
collectedOperators: BitSet): Unit
processPlanSkippingSubqueries
...FIXME
collectOperatorsWithID¶
collectOperatorsWithID(
plan: QueryPlan[_],
operators: ArrayBuffer[QueryPlan[_]],
collectedOperators: BitSet): Unit
collectOperatorsWithID
...FIXME
removeTags¶
removeTags(
plan: QueryPlan[_]): Unit
removeTags
...FIXME
generateOperatorIDs¶
generateOperatorIDs(
plan: QueryPlan[_],
startOperatorID: Int): Int
generateOperatorIDs
...FIXME