CreateViewCommand Logical Command¶
CreateViewCommand is a RunnableCommand that represents the following high-level operators (among other internal uses):
- CREATE VIEW SQL command
- Dataset.createGlobalTempView
- Dataset.createOrReplaceGlobalTempView
- Dataset.createOrReplaceTempView
- Dataset.createTempView
CreateViewCommand is an AnalysisOnlyCommand.
Creating Instance¶
CreateViewCommand takes the following to be created:
- Table name (TableIdentifier)
- User-specified columns (Seq[(String, Option[String])])
- Optional comments
- Properties (Map[String, String])
- Optional "Original Text"
- Logical query plan
- allowExisting flag
- replace flag
- ViewType
- isAnalyzed flag (default: false)
- referredTempFunctions
CreateViewCommand is created when:
- CacheTableAsSelectExec physical operator is executed (and requested for planToCache)
- Dataset is requested to createTempViewCommand
- ResolveSessionCatalog logical analysis rule is executed (to resolve CreateView logical operator)
- SparkSqlAstBuilder is requested to parse a CREATE VIEW AS statement
Executing Command¶
RunnableCommand
run(
sparkSession: SparkSession): Seq[Row]
run is part of the RunnableCommand abstraction.
run requests the given SparkSession for the SessionCatalog (through the SessionState).
run branches off based on this ViewType.
For LocalTempView, run creates a temporary view relation and requests the SessionCatalog to create a local temporary view.
For GlobalTempView, run creates a temporary view relation and requests the SessionCatalog to create a global temporary view (in the global temporary views database per spark.sql.globalTempDatabase configuration property).
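The two temporary branches can be exercised through the Dataset operators listed at the top of this page. The following is a minimal sketch, assuming Spark SQL is on the classpath and a local master is acceptable; the object name, view names and application name are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper (not part of Spark) that registers one local and
// one global temporary view and reads both back.
object TempViewsSketch {
  def run(): (Long, Long) = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("temp-views-sketch")
      .getOrCreate()
    try {
      val ds = spark.range(3)

      // Local temporary view: visible in this SparkSession only
      ds.createOrReplaceTempView("demo_local_view")

      // Global temporary view: registered in the global temporary views database
      ds.createOrReplaceGlobalTempView("demo_global_view")

      // The database name comes from the spark.sql.globalTempDatabase property
      val globalDb = spark.conf.get("spark.sql.globalTempDatabase")

      val localCount = spark.table("demo_local_view").count()
      val globalCount = spark.table(s"$globalDb.demo_global_view").count()
      (localCount, globalCount)
    } finally {
      spark.stop()
    }
  }
}
```

A global temporary view has to be qualified with the global temporary views database name when queried, which is why the sketch looks the name up rather than hard-coding it.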
For a non-temporary view, run branches off based on whether the view name is registered already or not.
When the view name is in use already and this allowExisting flag is enabled, run does nothing.
When the view name is in use and this replace flag is enabled, run prints out the following DEBUG message to the logs:
Try to uncache [name] before replacing.
run then requests the Catalog to remove the table from the in-memory cache followed by the SessionCatalog to drop and (re)create it.
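In SQL, the allowExisting flag corresponds to IF NOT EXISTS and the replace flag to OR REPLACE. The sketch below tries both against an existing view. It assumes Spark SQL on the classpath, a local master and the default session catalog; the object and view names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper (not part of Spark) showing the effect of
// IF NOT EXISTS (allowExisting) and OR REPLACE (replace) on an existing view.
object ViewFlagsSketch {
  def run(): (Long, Long) = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("view-flags-sketch")
      .getOrCreate()
    try {
      spark.sql("DROP VIEW IF EXISTS flags_demo_view")
      spark.sql("CREATE VIEW flags_demo_view AS SELECT * FROM VALUES 1, 2, 3 t(id)")

      // allowExisting: the name is in use, so the existing view should be left untouched
      spark.sql("CREATE VIEW IF NOT EXISTS flags_demo_view AS SELECT * FROM VALUES 1 t(id)")
      val afterIfNotExists = spark.table("flags_demo_view").count()

      // replace: the existing view is dropped and created again with the new query
      spark.sql("CREATE OR REPLACE VIEW flags_demo_view AS SELECT * FROM VALUES 1 t(id)")
      val afterReplace = spark.table("flags_demo_view").count()

      spark.sql("DROP VIEW flags_demo_view")
      (afterIfNotExists, afterReplace)
    } finally {
      spark.stop()
    }
  }
}
```

Comparing the two row counts is a simple way to tell whether the second and third statements changed the view definition.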
Extra Checks when View Name In Use
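run may also report an AnalysisException in this branch, e.g. when the name refers to a table rather than a view, when replacing the view would introduce a cyclic view reference, or when neither the allowExisting nor the replace flag is enabled. These checks are not covered in detail here.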
When the view is not temporary and the view name is not registered, run requests the SessionCatalog to create a (metastore) table.
In the end, run returns no rows (no metrics or similar).
AnalysisException
run throws an AnalysisException when the isAnalyzed flag is disabled.
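The branching in run can be summarized as a small decision function. This is a simplified model in plain Scala and not the Spark implementation: every type and name below is hypothetical, and the last case merely stands in for the extra checks that this page does not detail.

```scala
// A simplified, hypothetical model of the branching in run.
// None of these definitions are Spark APIs; they only mirror the prose.
object CreateViewRunSketch {
  sealed trait ViewType
  case object LocalTempView extends ViewType
  case object GlobalTempView extends ViewType
  case object PersistedView extends ViewType

  sealed trait Action
  case object CreateLocalTempView extends Action
  case object CreateGlobalTempView extends Action
  case object DoNothing extends Action
  case object UncacheDropAndRecreate extends Action
  case object CreateMetastoreTable extends Action
  final case class Fail(reason: String) extends Action

  def decide(
      viewType: ViewType,
      nameInUse: Boolean,
      allowExisting: Boolean,
      replace: Boolean,
      isAnalyzed: Boolean = true): Action = {
    if (!isAnalyzed) {
      // run refuses to execute with the isAnalyzed flag disabled
      Fail("logical plan is not analyzed")
    } else {
      viewType match {
        case LocalTempView                  => CreateLocalTempView
        case GlobalTempView                 => CreateGlobalTempView
        case PersistedView if !nameInUse    => CreateMetastoreTable
        case PersistedView if allowExisting => DoNothing
        case PersistedView if replace       => UncacheDropAndRecreate
        // Assumption: stands in for the extra checks on a name in use
        case PersistedView                  => Fail("view name is in use")
      }
    }
  }
}
```

Modelling the outcome as a value rather than performing side effects keeps the sketch easy to check case by case, e.g. `CreateViewRunSketch.decide(CreateViewRunSketch.PersistedView, nameInUse = true, allowExisting = true, replace = false)` evaluates to `DoNothing`.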
Preparing CatalogTable¶
prepareTable(
session: SparkSession,
analyzedPlan: LogicalPlan): CatalogTable
prepareTable creates a CatalogTable with the following properties:
| Property Name | Property Value |
|---|---|
| identifier | Table name |
| tableType | VIEW |
| storage | Empty CatalogStorageFormat |
| schema | Aliased schema of the given LogicalPlan |
| properties | generateViewProperties |
| viewOriginalText | originalText |
| viewText | originalText |
| comment | comment |
AnalysisException
prepareTable reports an AnalysisException when this originalText is not defined.
Demo¶
val tableName = "demo_source_table"
// Demo table for "AS query" part
sql(s"CREATE TABLE ${tableName} AS SELECT * FROM VALUES 1,2,3 t(id)")
// The "AS" query
val asQuery = s"SELECT * FROM ${tableName}"
val viewName = "demo_view"
sql(s"CREATE OR REPLACE VIEW ${viewName} AS ${asQuery}")
sql("SHOW VIEWS").show(truncate = false)
+---------+---------+-----------+
|namespace|viewName |isTemporary|
+---------+---------+-----------+
|default  |demo_view|false      |
+---------+---------+-----------+
sql(s"DESC EXTENDED ${viewName}").show(truncate = false)
+----------------------------+---------------------------------------------------------+-------+
|col_name                    |data_type                                                |comment|
+----------------------------+---------------------------------------------------------+-------+
|id                          |int                                                      |NULL   |
|                            |                                                         |       |
|# Detailed Table Information|                                                         |       |
|Catalog                     |spark_catalog                                            |       |
|Database                    |default                                                  |       |
|Table                       |demo_view                                                |       |
|Owner                       |jacek                                                    |       |
|Created Time                |Sun Apr 21 18:22:16 CEST 2024                            |       |
|Last Access                 |UNKNOWN                                                  |       |
|Created By                  |Spark 3.5.1                                              |       |
|Type                        |VIEW                                                     |       |
|View Text                   |SELECT * FROM demo_source_table                          |       |
|View Original Text          |SELECT * FROM demo_source_table                          |       |
|View Catalog and Namespace  |spark_catalog.default                                    |       |
|View Query Output Columns   |[id]                                                     |       |
|Table Properties            |[transient_lastDdlTime=1713716536]                       |       |
|Serde Library               |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe       |       |
|InputFormat                 |org.apache.hadoop.mapred.SequenceFileInputFormat         |       |
|OutputFormat                |org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat|       |
|Storage Properties          |[serialization.format=1]                                 |       |
+----------------------------+---------------------------------------------------------+-------+