# DeltaCatalog
DeltaCatalog is a DelegatingCatalogExtension (Spark SQL) and a StagingTableCatalog.
DeltaCatalog is registered using the spark.sql.catalog.spark_catalog (Spark SQL) configuration property.
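As an example, a Spark application typically pairs this property with the Delta SQL extensions (the appName and master values below are illustrative):

```scala
import org.apache.spark.sql.SparkSession

// SparkSession with DeltaCatalog installed as the session catalog
// (spark_catalog) and the Delta SQL extensions enabled.
val spark = SparkSession.builder()
  .appName("DeltaCatalogDemo")
  .master("local[*]")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config(
    "spark.sql.catalog.spark_catalog",
    "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()
```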
## StagingTableCatalog
DeltaCatalog is a StagingTableCatalog (Spark SQL) that creates a StagedDeltaTableV2 (for the delta data source) or a BestEffortStagedTable (for any other data source).
### stageCreate

```scala
stageCreate(
  ident: Identifier,
  schema: StructType,
  partitions: Array[Transform],
  properties: util.Map[String, String]): StagedTable
```
stageCreate is part of the StagingTableCatalog (Spark SQL) abstraction.
stageCreate creates a StagedDeltaTableV2 (with the TableCreationModes.Create operation) for the delta data source only (based on the given properties or the spark.sql.sources.default configuration property).

Otherwise, stageCreate creates a BestEffortStagedTable (requesting the parent TableCatalog to create the table).
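With DeltaCatalog registered as the session catalog (as above), an atomic CREATE TABLE ... AS SELECT is the kind of statement that exercises stageCreate; the table name below is made up:

```scala
// CTAS with the delta provider is staged atomically via stageCreate.
spark.sql("""
  CREATE TABLE stage_create_demo
  USING delta
  AS SELECT id FROM range(5)
""")
```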
### stageCreateOrReplace

```scala
stageCreateOrReplace(
  ident: Identifier,
  schema: StructType,
  partitions: Array[Transform],
  properties: util.Map[String, String]): StagedTable
```
stageCreateOrReplace is part of the StagingTableCatalog (Spark SQL) abstraction.
stageCreateOrReplace creates a StagedDeltaTableV2 (with the TableCreationModes.CreateOrReplace operation) for the delta data source only (based on the given properties or the spark.sql.sources.default configuration property).

Otherwise, stageCreateOrReplace requests the parent TableCatalog to drop the table first and then creates a BestEffortStagedTable (requesting the parent TableCatalog to create the table).
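Similarly, a CREATE OR REPLACE TABLE ... AS SELECT statement is what typically reaches stageCreateOrReplace (same hypothetical table as before):

```scala
// CREATE OR REPLACE ... AS SELECT is staged via stageCreateOrReplace.
spark.sql("""
  CREATE OR REPLACE TABLE stage_create_demo
  USING delta
  AS SELECT id FROM range(10)
""")
```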
### stageReplace

```scala
stageReplace(
  ident: Identifier,
  schema: StructType,
  partitions: Array[Transform],
  properties: util.Map[String, String]): StagedTable
```
stageReplace is part of the StagingTableCatalog (Spark SQL) abstraction.
stageReplace creates a StagedDeltaTableV2 (with the TableCreationModes.Replace operation) for the delta data source only (based on the given properties or the spark.sql.sources.default configuration property).

Otherwise, stageReplace requests the parent TableCatalog to drop the table first and then creates a BestEffortStagedTable (requesting the parent TableCatalog to create the table).
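And a bare REPLACE TABLE ... AS SELECT (which requires the table to exist already) is what typically maps to stageReplace:

```scala
// REPLACE TABLE ... AS SELECT is staged via stageReplace.
spark.sql("""
  REPLACE TABLE stage_create_demo
  USING delta
  AS SELECT id * 2 AS id FROM range(10)
""")
```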
## Altering Table

```scala
alterTable(
  ident: Identifier,
  changes: TableChange*): Table
```
alterTable is part of the TableCatalog (Spark SQL) abstraction.
alterTable loads the table and continues only when it is a DeltaTableV2. Otherwise, alterTable delegates to the parent TableCatalog.
alterTable groups the given TableChanges by their (class) type.
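A minimal sketch of that grouping (an illustration, not the verbatim implementation):

```scala
import org.apache.spark.sql.connector.catalog.TableChange

// Group the requested changes by their concrete TableChange subclass
// so that every group can be executed as one ALTER command.
def groupChanges(changes: Seq[TableChange]): Map[Class[_ <: TableChange], Seq[TableChange]] =
  changes.groupBy(_.getClass)
```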
In addition, alterTable collects the following ColumnChanges together (that are then executed as column updates using AlterTableChangeColumnDeltaCommand):

* RenameColumn
* UpdateColumnComment
* UpdateColumnNullability
* UpdateColumnPosition
* UpdateColumnType
alterTable executes the table changes as one of the AlterDeltaTableCommands:
| TableChange | AlterDeltaTableCommand |
|---|---|
| AddColumn | AlterTableAddColumnsDeltaCommand |
| AddConstraint | AlterTableAddConstraintDeltaCommand |
| ColumnChange | AlterTableChangeColumnDeltaCommand |
| DropConstraint | AlterTableDropConstraintDeltaCommand |
| RemoveProperty | AlterTableUnsetPropertiesDeltaCommand |
| SetLocation (a SetProperty with the location property; catalog delta tables only) | AlterTableSetLocationDeltaCommand |
| SetProperty | AlterTableSetPropertiesDeltaCommand |
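As an illustration, the following statements (against the hypothetical table from earlier) map to AlterTableAddColumnsDeltaCommand and AlterTableSetPropertiesDeltaCommand, respectively:

```scala
// AddColumn => AlterTableAddColumnsDeltaCommand
spark.sql("ALTER TABLE stage_create_demo ADD COLUMNS (name STRING)")

// SetProperty => AlterTableSetPropertiesDeltaCommand
spark.sql(
  "ALTER TABLE stage_create_demo SET TBLPROPERTIES ('delta.appendOnly' = 'true')")
```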
alterTable...FIXME
## Creating Table

```scala
createTable(
  ident: Identifier,
  schema: StructType,
  partitions: Array[Transform],
  properties: util.Map[String, String]): Table
```
createTable is part of the TableCatalog (Spark SQL) abstraction.
createTable...FIXME
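Even with the internals left as FIXME, a plain (non-AS SELECT) CREATE TABLE is the kind of statement that lands here; for example:

```scala
// A plain CREATE TABLE (no AS SELECT) goes through createTable.
spark.sql("""
  CREATE TABLE create_table_demo (id LONG, name STRING)
  USING delta
  PARTITIONED BY (name)
""")
```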
## Loading Table

```scala
loadTable(
  ident: Identifier): Table
```
loadTable is part of the TableCatalog (Spark SQL) abstraction.
loadTable loads the table by the given identifier from the parent catalog.

If found and the table is a delta table (Spark SQL's V1Table with the delta provider), loadTable creates a DeltaTableV2.
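For example, resolving a delta table by name goes through loadTable (table name from the hypothetical example above):

```scala
// Resolving a delta table by name ends up in DeltaCatalog.loadTable,
// which returns a DeltaTableV2 for delta tables.
val demo = spark.table("create_table_demo")
demo.printSchema()
```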
## Creating Delta Table

```scala
createDeltaTable(
  ident: Identifier,
  schema: StructType,
  partitions: Array[Transform],
  allTableProperties: Map[String, String],
  writeOptions: Map[String, String],
  sourceQuery: Option[DataFrame],
  operation: TableCreationModes.CreationMode): Table
```
createDeltaTable...FIXME
createDeltaTable is used when:
* DeltaCatalog is requested to create a table
* StagedDeltaTableV2 is requested to commitStagedChanges
### Operation
createDeltaTable is given an argument of type TableCreationModes.CreationMode:
* Create when DeltaCatalog creates a table
* StagedDeltaTableV2 is given a CreationMode when created
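A minimal sketch of what such an enumeration can look like (an assumption for illustration; names and packaging do not necessarily match the Delta Lake sources):

```scala
// Hypothetical sketch of the creation modes as a sealed ADT.
object TableCreationModes {
  sealed trait CreationMode
  case object Create extends CreationMode
  case object CreateOrReplace extends CreationMode
  case object Replace extends CreationMode
}
```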
## validateClusterBySpec

```scala
validateClusterBySpec(
  maybeClusterBySpec: Option[ClusterBySpec],
  schema: StructType): Unit
```
validateClusterBySpec...FIXME
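Although the validation itself is left as FIXME, a ClusterBySpec typically originates from a CLUSTER BY clause (liquid clustering), e.g.:

```scala
// CLUSTER BY produces the ClusterBySpec that validateClusterBySpec
// checks against the table schema.
spark.sql("""
  CREATE TABLE clustered_demo (id LONG, ts TIMESTAMP)
  USING delta
  CLUSTER BY (id)
""")
```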
## Looking Up Table Provider

```scala
getProvider(
  properties: util.Map[String, String]): String
```
getProvider takes the value of the provider property from the given properties (if available) or defaults to the value of the spark.sql.sources.default (Spark SQL) configuration property.
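A minimal sketch of that lookup (assuming a SQLConf in scope; an illustration, not the verbatim source):

```scala
import java.util.{Map => JMap}
import org.apache.spark.sql.internal.SQLConf

// Prefer an explicit "provider" table property; otherwise fall back to
// the session default data source (spark.sql.sources.default).
def getProvider(properties: JMap[String, String], conf: SQLConf): String =
  Option(properties.get("provider"))
    .getOrElse(conf.getConf(SQLConf.DEFAULT_DATA_SOURCE_NAME))
```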
getProvider is used when:
* DeltaCatalog is requested to createTable, stageReplace, stageCreateOrReplace and stageCreate