QueryPlanningTracker¶
QueryPlanningTracker is used to track the execution phases of a structured query.
| Phase | Description |
|---|---|
| parsing | SparkSession is requested to execute a SQL query |
| analysis | QueryExecution is requested for an analyzed query plan |
| optimization | QueryExecution is requested for an optimized query plan |
| planning | QueryExecution is requested for the physical and executed query plans |
Accessing QueryPlanningTracker¶
The QueryPlanningTracker of a structured query is available as the tracker property of its QueryExecution.
val df_ops = spark.range(1000).selectExpr("count(*)")
val tracker = df_ops.queryExecution.tracker
// There are three execution phases tracked for structured queries using Dataset API
assert(tracker.phases.keySet == Set("analysis", "optimization", "planning"))
val df_sql = spark.sql("SELECT * FROM range(1000)")
// A distinct name avoids redefining tracker when the snippets run outside the REPL
val tracker_sql = df_sql.queryExecution.tracker
// There are four execution phases tracked for structured queries using SQL
assert(tracker_sql.phases.keySet == Set("parsing", "analysis", "optimization", "planning"))
Creating Instance¶
QueryPlanningTracker takes no arguments to be created.
QueryPlanningTracker is created when:
- SparkSession is requested to execute a SQL query
- QueryExecution is created
Getting QueryPlanningTracker¶
get: Option[QueryPlanningTracker]
The get utility gives access to the QueryPlanningTracker bound to the current thread (using a thread-local variable), if any.
import org.apache.spark.sql.catalyst.QueryPlanningTracker
scala> :type QueryPlanningTracker.get
Option[org.apache.spark.sql.catalyst.QueryPlanningTracker]
get is used when:
- RuleExecutor is requested to execute rules on a query plan
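The thread-local binding described above can be sketched in plain Scala. This is a simplified model, not Spark's code: the `ThreadBoundTracker` class, its `label` field, and the `withTracker` helper are names assumed here for illustration only.

```scala
// Hypothetical stand-in for a tracker instance (not Spark's class)
final class ThreadBoundTracker(val label: String)

object ThreadBoundTracker {
  // One slot per thread; the default value is null, meaning "no tracker bound"
  private val localTracker = new ThreadLocal[ThreadBoundTracker]()

  // None when the current thread has no tracker bound
  def get: Option[ThreadBoundTracker] = Option(localTracker.get())

  // Assumed helper: binds the tracker while f runs, then restores the previous binding
  def withTracker[T](tracker: ThreadBoundTracker)(f: => T): T = {
    val previous = localTracker.get()
    localTracker.set(tracker)
    try f finally localTracker.set(previous)
  }
}
```

Returning an `Option` lets callers that run outside a tracked query skip the bookkeeping instead of checking for null.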
Measuring Execution Phase¶
measurePhase[T](
phase: String)(
f: => T): T
measurePhase executes the given block f and records its start and end times under the given phase in the phasesMap registry.
If the given phase has already been recorded in the phasesMap registry, measurePhase replaces only the end time (the original start time is kept).
measurePhase is used when:
- SparkSession is requested to execute a SQL query
- QueryExecution is requested to executePhase
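The recording behavior of measurePhase can be illustrated with a minimal sketch in plain Scala. It is a model under stated assumptions rather than Spark's implementation: `PhaseTimes` and `SketchTracker` are hypothetical names, the registry is a `scala.collection.mutable.HashMap`, and the injectable `clock` parameter exists only to make the example deterministic.

```scala
import scala.collection.mutable

// Hypothetical stand-in for PhaseSummary: start and end times in milliseconds
final case class PhaseTimes(startTimeMs: Long, endTimeMs: Long) {
  def durationMs: Long = endTimeMs - startTimeMs
}

// Hypothetical, simplified model of the tracker (not Spark's class)
class SketchTracker(clock: () => Long = () => System.currentTimeMillis()) {
  // Registry of phases and their times, created together with the tracker
  private val phasesMap = mutable.HashMap.empty[String, PhaseTimes]

  def measurePhase[T](phase: String)(f: => T): T = {
    val startTime = clock()
    val result = f
    val endTime = clock()
    phasesMap.get(phase) match {
      // Phase already recorded: keep the original start time, replace the end time
      case Some(previous) => phasesMap.update(phase, previous.copy(endTimeMs = endTime))
      // First time the phase is seen: record both times
      case None => phasesMap.update(phase, PhaseTimes(startTime, endTime))
    }
    result
  }

  // Immutable snapshot of the registry
  def phases: Map[String, PhaseTimes] = phasesMap.toMap
}
```

With this sketch, measuring the same phase twice yields one entry that spans from the first start to the last end, so `durationMs` also includes any time spent between the two measurements.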
phasesMap Registry¶
phasesMap: HashMap[String, PhaseSummary]
QueryPlanningTracker creates a phasesMap registry of phases and their start and end times (PhaseSummary) when created.
A phase with a PhaseSummary is added (recorded) in measurePhase.
phasesMap is available using the phases method.
Execution Phases Summaries¶
phases: Map[String, PhaseSummary]
phases gives the phasesMap registry.
Note
phases seems to be used in tests only.