CacheTableAsSelectExec Physical Operator¶
CacheTableAsSelectExec is a BaseCacheTableExec physical operator that represents a CACHE TABLE SQL command with a query (as a CacheTableAsSelect logical operator) at execution time.
CACHE [LAZY] TABLE identifierReference
[OPTIONS key=value (, key=value)*]
[[AS] query]
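The syntax above can be exercised from Scala roughly as follows. This is a minimal sketch that assumes a local SparkSession; the view names (squares, cubes), the queries, and the storageLevel value are illustrative choices and do not come from this page.

```scala
import org.apache.spark.sql.SparkSession

object CacheTableSyntaxDemo {
  def main(args: Array[String]): Unit = {
    // Local session for demonstration only (assumption: Spark SQL is on the classpath).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cache-table-syntax")
      .getOrCreate()

    // Eager (default) variant: the query is registered as a temporary view and cached.
    spark.sql("CACHE TABLE squares AS SELECT id, id * id AS sq FROM range(5)")

    // LAZY variant with an OPTIONS clause: the data is only cached on first use.
    spark.sql(
      """CACHE LAZY TABLE cubes
        |OPTIONS ('storageLevel' 'MEMORY_ONLY')
        |AS SELECT id, id * id * id AS cube FROM range(5)""".stripMargin)

    // Both views are registered with the cache, whether eager or lazy.
    assert(spark.catalog.isCached("squares"))
    assert(spark.catalog.isCached("cubes"))

    spark.table("squares").show()
    spark.stop()
  }
}
```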
When executed, CacheTableAsSelectExec runs a CreateViewCommand logical operator and then uses the SparkSession.table operator to create the LogicalPlan to cache.
In other words, CacheTableAsSelectExec is a shortcut for executing the CREATE TEMPORARY VIEW SQL command (or the corresponding Dataset operator, e.g. Dataset.createTempView) followed by CACHE TABLE (which boils down to requesting the session-wide CacheManager to cache this LogicalPlan).
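The equivalence described above can be observed from user code. The following is a sketch under stated assumptions, not the operator's implementation: it assumes a local SparkSession, and the view names shortcut_view and manual_view are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object CacheTableEquivalenceDemo {
  def main(args: Array[String]): Unit = {
    // Local session for demonstration only (assumption: Spark SQL is on the classpath).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cache-table-equivalence")
      .getOrCreate()

    // Shortcut: a single command creates the temporary view and caches it.
    spark.sql("CACHE TABLE shortcut_view AS SELECT id FROM range(3)")

    // Longer route: register the temporary view first, then cache it by name.
    spark.range(3).createTempView("manual_view")
    spark.sql("CACHE TABLE manual_view")

    // Both views end up registered with the cache and return the same rows.
    assert(spark.catalog.isCached("shortcut_view"))
    assert(spark.catalog.isCached("manual_view"))
    assert(spark.table("shortcut_view").count() == 3L)
    assert(spark.table("manual_view").count() == 3L)

    spark.stop()
  }
}
```

The two routes differ only in how many statements the user writes; in both cases the cached entry is keyed by the plan that SparkSession.table produces for the view.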
Creating Instance¶
CacheTableAsSelectExec takes the following to be created:
- The name of the temporary view
- The LogicalPlan of the query
- Original SQL Text
- isLazy flag
- Options (Map[String, String])
- Referred temporary functions (Seq[String])
CacheTableAsSelectExec is created when:
- DataSourceV2Strategy execution planning strategy is executed (to plan a CacheTableAsSelect logical operator)
Relation Name¶
relationName is the name of the temporary view.
LogicalPlan to Cache¶
BaseCacheTableExec
planToCache: LogicalPlan
planToCache is part of the BaseCacheTableExec abstraction.
Lazy Value
planToCache is a Scala lazy value, which guarantees that its initializer is executed only once (when the value is first accessed) and that the computed value never changes afterwards.
Learn more in the Scala Language Specification.
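The once-only initialization of a lazy value can be seen in plain Scala, without Spark. In this sketch the names LazyValDemo, initCount and plan are hypothetical stand-ins; plan plays the role of an expensive initializer such as planToCache.

```scala
object LazyValDemo {
  // Counts how many times the initializer body runs.
  var initCount = 0

  // Hypothetical stand-in for an expensive lazy value.
  lazy val plan: String = {
    initCount += 1
    "plan"
  }

  def main(args: Array[String]): Unit = {
    assert(initCount == 0) // declaring the lazy val runs nothing
    val first = plan       // first access runs the initializer
    val second = plan      // later accesses reuse the stored value
    assert(initCount == 1)
    assert(first eq second)
    println(s"initializer ran $initCount time(s)")
  }
}
```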
planToCache creates a CreateViewCommand logical operator that is immediately executed.
| CreateViewCommand | Value |
|---|---|
| Table name | The name of the temporary view |
| Original Text | The original SQL text |
| Logical query plan | The LogicalPlan of the query |
| ViewType | LocalTempView |
In the end, planToCache requests dataFrameForCachedPlan for its logical plan.
dataFrameForCachedPlan¶
BaseCacheTableExec
dataFrameForCachedPlan: DataFrame
dataFrameForCachedPlan is part of the BaseCacheTableExec abstraction.
Lazy Value
dataFrameForCachedPlan is a Scala lazy value, which guarantees that its initializer is executed only once (when the value is first accessed) and that the computed value never changes afterwards.
Learn more in the Scala Language Specification.
dataFrameForCachedPlan uses the SparkSession.table operator to create a DataFrame that represents loading data from the temporary view.