CacheTableAsSelectExec Physical Operator
CacheTableAsSelectExec is a BaseCacheTableExec physical operator that represents CACHE TABLE SQL command (as a CacheTableAsSelect logical operator) at execution.
CACHE [LAZY] TABLE identifierReference
[OPTIONS key=value (, key=value)*]
[[AS] query]
When executed, CacheTableAsSelectExec executes a CreateViewCommand logical operator (to register the temporary view) and then uses the SparkSession.table operator to create the LogicalPlan to cache.
In other words, CacheTableAsSelectExec is a shortcut for executing the CREATE TEMPORARY VIEW SQL command (or the corresponding Dataset operator, e.g. Dataset.createTempView) followed by CACHE TABLE (which boils down to requesting the CacheManager, shared across sessions, to cache the LogicalPlan).
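The shortcut described above can be illustrated with the public API. The following is a minimal sketch, assuming Spark 3.x is on the classpath and a local master is acceptable; the object name, the view names (cached_ids, manual_ids) and the range query are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.SparkSession

object CacheTableAsSelectDemo {
  // Returns the cache status of the two views registered below.
  def run(): (Boolean, Boolean) = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local mode is good enough for a demo
      .appName("cache-table-as-select-demo")
      .getOrCreate()
    try {
      // One statement: registers a temporary view and caches it.
      spark.sql("CACHE TABLE cached_ids AS SELECT id FROM range(5)")

      // Two steps with roughly the same effect, spelled out.
      spark.range(5).createTempView("manual_ids")
      spark.sql("CACHE TABLE manual_ids")

      (spark.catalog.isCached("cached_ids"), spark.catalog.isCached("manual_ids"))
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (oneStep, twoSteps) = run()
    println(s"cached_ids cached: $oneStep, manual_ids cached: $twoSteps")
  }
}
```

Both views should be reported as cached, since either route ends with the same request to the CacheManager.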
Creating Instance
CacheTableAsSelectExec takes the following to be created:
- The name of the temporary view
- The LogicalPlan of the query
- Original SQL Text
- isLazy flag
- Options (Map[String, String])
- Referred temporary functions (Seq[String])
CacheTableAsSelectExec is created when:
- DataSourceV2Strategy execution planning strategy is executed (to plan a CacheTableAsSelect logical operator)
Relation Name
relationName is the name of the temporary view.
LogicalPlan to Cache
BaseCacheTableExec
planToCache: LogicalPlan
planToCache is part of the BaseCacheTableExec abstraction.
Lazy Value
planToCache is a Scala lazy value to guarantee that the code to initialize it is executed once only (when accessed for the first time) and the computed value never changes afterwards.
Learn more in the Scala Language Specification.
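The once-only initialization of a lazy value can be checked in plain Scala, without Spark. The sketch below uses a hypothetical LazyOperator class whose member merely borrows the planToCache name; the counter exists only to make the number of initializations visible.

```scala
// Hypothetical stand-in for an operator with a lazily initialized member.
final class LazyOperator {
  // Counts how many times the initializer below actually runs.
  var initializations: Int = 0

  // The body runs on first access only; the result is stored and reused.
  lazy val planToCache: String = {
    initializations += 1
    "logical plan"
  }
}

object LazyValDemo {
  def main(args: Array[String]): Unit = {
    val op = new LazyOperator
    println(op.initializations) // 0, nothing has accessed planToCache yet
    println(op.planToCache)     // first access runs the initializer
    println(op.planToCache)     // second access reuses the stored value
    println(op.initializations) // 1
  }
}
```

This matters for planToCache because its initializer has a side effect (executing a command), which should happen at most once per operator instance.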
planToCache creates a CreateViewCommand logical operator that is immediately executed.
CreateViewCommand
| CreateViewCommand | Value |
|---|---|
| Table name | The name of the temporary view |
| Original Text | The original SQL text |
| Logical query plan | The LogicalPlan of the query |
| ViewType | LocalTempView |
In the end, planToCache requests the dataFrameForCachedPlan for the logical plan.
dataFrameForCachedPlan
BaseCacheTableExec
dataFrameForCachedPlan: DataFrame
dataFrameForCachedPlan is part of the BaseCacheTableExec abstraction.
Lazy Value
dataFrameForCachedPlan is a Scala lazy value to guarantee that the code to initialize it is executed once only (when accessed for the first time) and the computed value never changes afterwards.
Learn more in the Scala Language Specification.
dataFrameForCachedPlan uses SparkSession.table operator to create a DataFrame that represents loading data from the temporary view.
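From user code, the same SparkSession.table operator gives a DataFrame over the temporary view. The following is a minimal sketch, assuming Spark 3.x on the classpath; the object name, the lazy_ids view and the range query are hypothetical. It uses the LAZY variant, in which the cache is only materialized by the first action on the data.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DataFrameForCachedPlanDemo {
  // Returns the row count of the view and whether the view is registered as cached.
  def run(): (Long, Boolean) = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local mode is good enough for a demo
      .appName("dataframe-for-cached-plan-demo")
      .getOrCreate()
    try {
      // LAZY: the view is registered for caching, but no data is computed yet.
      spark.sql("CACHE LAZY TABLE lazy_ids AS SELECT id FROM range(3)")

      // A DataFrame that represents loading data from the temporary view.
      val df: DataFrame = spark.table("lazy_ids")

      // The first action computes the rows and fills the cache.
      val rows = df.count()

      (rows, spark.catalog.isCached("lazy_ids"))
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (rows, cached) = run()
    println(s"rows: $rows, cached: $cached")
  }
}
```

Without LAZY, the command itself triggers an action on this DataFrame, so the cache is filled as part of executing CACHE TABLE.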