DeltaColumnMappingBase (DeltaColumnMapping)¶
DeltaColumnMappingBase is an abstraction of DeltaColumnMappings.
Implementations¶
Compatible Protocol¶
DeltaColumnMappingBase defines a Protocol (with MIN_READER_VERSION and MIN_WRITER_VERSION) as the minimum protocol version for the readers and writers to delta tables with column mapping.
- Protocol utility is used for requiredMinimumProtocol
- delta.columnMapping.mode configuration property
- delta.columnMapping.maxColumnId configuration property
- DeltaErrors is requested to changeColumnMappingModeOnOldProtocol (for error reporting)
Minimum Reader Version¶
DeltaColumnMappingBase defines MIN_READER_VERSION constant as 2 for the minimum version of the compatible readers of delta tables to satisfyColumnMappingProtocol.
Minimum Writer Version¶
DeltaColumnMappingBase defines MIN_WRITER_VERSION constant as 5 for the minimum version of the compatible writers to delta tables to satisfyColumnMappingProtocol.
createPhysicalSchema¶
createPhysicalSchema(
  schema: StructType,
  referenceSchema: StructType,
  columnMappingMode: DeltaColumnMappingMode,
  checkSupportedMode: Boolean = true): StructType
createPhysicalSchema...FIXME
createPhysicalSchema is used when:
- DeltaColumnMappingBase is requested to checkColumnIdAndPhysicalNameAssignments and createPhysicalAttributes
- DeltaParquetFileFormat is requested to prepare a schema
renameColumns¶
renameColumns(
  schema: StructType): StructType
renameColumns...FIXME
renameColumns is used when:
- Metadata is requested for the physicalPartitionSchema
requiresNewProtocol¶
requiresNewProtocol(
  metadata: Metadata): Boolean
requiresNewProtocol is true when the DeltaColumnMappingMode (of this delta table per the given Metadata) is either IdMapping or NameMapping. Otherwise, requiresNewProtocol is false.
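The rule above can be sketched in plain Scala. The `DeltaColumnMappingMode` hierarchy and the one-field `Metadata` below are simplified stand-ins defined only for this illustration (they are not Delta Lake's actual classes), and the enclosing object name is hypothetical.

```scala
// A minimal sketch with stand-in types; not Delta Lake's actual implementation.
object RequiresNewProtocolSketch {
  sealed trait DeltaColumnMappingMode
  case object NoMapping extends DeltaColumnMappingMode
  case object IdMapping extends DeltaColumnMappingMode
  case object NameMapping extends DeltaColumnMappingMode

  // Hypothetical, pared-down Metadata that carries only the column mapping mode.
  final case class Metadata(columnMappingMode: DeltaColumnMappingMode)

  // true only for the two modes that actually map columns
  def requiresNewProtocol(metadata: Metadata): Boolean =
    metadata.columnMappingMode match {
      case IdMapping | NameMapping => true
      case NoMapping               => false
    }
}
```

Matching on a sealed trait lets the compiler warn if a new mode is added without deciding whether it needs the newer protocol.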
requiresNewProtocol is used when:
- Protocol utility is used to determine the required minimum protocol
checkColumnIdAndPhysicalNameAssignments¶
checkColumnIdAndPhysicalNameAssignments(
  schema: StructType,
  mode: DeltaColumnMappingMode): Unit
checkColumnIdAndPhysicalNameAssignments...FIXME
checkColumnIdAndPhysicalNameAssignments is used when:
- OptimisticTransactionImpl is requested to verify the new metadata
dropColumnMappingMetadata¶
dropColumnMappingMetadata(
  schema: StructType): StructType
dropColumnMappingMetadata...FIXME
dropColumnMappingMetadata is used when:
- DeltaLog is requested for a BaseRelation and for a DataFrame
- DeltaTableV2 is requested for the tableSchema
- AlterTableSetLocationDeltaCommand command is executed
- CreateDeltaTableCommand command is executed
- ImplicitMetadataOperation is requested to update the metadata
Mapping Virtual to Physical Field Name¶
getPhysicalName(
  field: StructField): String
getPhysicalName requests the given StructField (Spark SQL) for the Metadata to extract delta.columnMapping.physicalName key, if available (for column mapping). Otherwise, getPhysicalName returns the name of the given StructField (with no name changes).
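The lookup-with-fallback described above can be sketched as follows. `Field` is a hypothetical stand-in for Spark SQL's `StructField`, reduced to a name and string-valued metadata so that the example runs without Spark on the classpath. Only the `delta.columnMapping.physicalName` key comes from the text.

```scala
// A minimal sketch with a stand-in field type; not Delta Lake's actual implementation.
object PhysicalNameSketch {
  // Metadata key that holds the physical column name under column mapping
  val PhysicalNameKey = "delta.columnMapping.physicalName"

  // Hypothetical stand-in for Spark SQL's StructField.
  final case class Field(name: String, metadata: Map[String, String] = Map.empty)

  // Use the physical name when the key is present, else the field's own name
  def getPhysicalName(field: Field): String =
    field.metadata.getOrElse(PhysicalNameKey, field.name)
}
```

A field without the key (for example, in a table with no column mapping) keeps its logical name, so callers do not need to branch on the mapping mode.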
getPhysicalName is used when:
- CheckpointV2 utility is used to extractPartitionValues
- ConflictChecker is requested to getPrettyPartitionMessage
- DeltaColumnMappingBase is requested to renameColumns, assignPhysicalNames and createPhysicalSchema
- DeltaLog utility is used to rewritePartitionFilters
- AlterTableChangeColumnDeltaCommand is executed
- ConvertToDeltaCommand utility is used to create an AddFile
- TahoeFileIndex is requested to makePartitionDirectories
- DataSkippingReaderBase is requested to getStatsColumnOpt
- StatisticsCollection is requested to collect statistics
verifyAndUpdateMetadataChange¶
verifyAndUpdateMetadataChange(
  oldProtocol: Protocol,
  oldMetadata: Metadata,
  newMetadata: Metadata,
  isCreatingNewTable: Boolean): Metadata
verifyAndUpdateMetadataChange...FIXME
In the end, verifyAndUpdateMetadataChange calls tryFixMetadata with the given newMetadata and oldMetadata.
verifyAndUpdateMetadataChange is used when:
- OptimisticTransactionImpl is requested to updateMetadataInternal
tryFixMetadata¶
tryFixMetadata(
  oldMetadata: Metadata,
  newMetadata: Metadata,
  isChangingModeOnExistingTable: Boolean): Metadata
tryFixMetadata reads the columnMapping.mode table property from the given newMetadata table metadata.
If the DeltaColumnMappingMode is IdMapping or NameMapping, tryFixMetadata calls assignColumnIdAndPhysicalName with the given newMetadata and oldMetadata metadata and the isChangingModeOnExistingTable flag.
For NoMapping, tryFixMetadata does nothing and returns the given newMetadata.
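The dispatch on the mapping mode can be sketched as below. The types are simplified stand-ins, and `assignColumnIdAndPhysicalName` is passed in as a function parameter (an assumption made for this illustration) because its behavior is not covered here; the `schema` string field is likewise hypothetical.

```scala
// A minimal sketch with stand-in types; not Delta Lake's actual implementation.
object TryFixMetadataSketch {
  sealed trait DeltaColumnMappingMode
  case object NoMapping extends DeltaColumnMappingMode
  case object IdMapping extends DeltaColumnMappingMode
  case object NameMapping extends DeltaColumnMappingMode

  // Hypothetical, pared-down Metadata: the mapping mode plus an opaque schema string.
  final case class Metadata(columnMappingMode: DeltaColumnMappingMode, schema: String)

  def tryFixMetadata(
      oldMetadata: Metadata,
      newMetadata: Metadata,
      isChangingModeOnExistingTable: Boolean)(
      // Supplied by the caller in this sketch (assumption)
      assignColumnIdAndPhysicalName: (Metadata, Metadata, Boolean) => Metadata): Metadata =
    newMetadata.columnMappingMode match {
      case IdMapping | NameMapping =>
        assignColumnIdAndPhysicalName(newMetadata, oldMetadata, isChangingModeOnExistingTable)
      case NoMapping =>
        newMetadata // nothing to fix without column mapping
    }
}
```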
satisfyColumnMappingProtocol¶
satisfyColumnMappingProtocol(
  protocol: Protocol): Boolean
satisfyColumnMappingProtocol returns true when all the following hold true:
- minWriterVersion of the given Protocol is at least 5
- minReaderVersion of the given Protocol is at least 2
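The two conditions above can be sketched as follows. `Protocol` is a pared-down stand-in (an assumption for this example) with only the two version fields. The constants mirror the MIN_READER_VERSION and MIN_WRITER_VERSION values given earlier.

```scala
// A minimal sketch with a stand-in Protocol; not Delta Lake's actual implementation.
object ColumnMappingProtocolSketch {
  // Hypothetical stand-in for Delta Lake's Protocol action.
  final case class Protocol(minReaderVersion: Int, minWriterVersion: Int)

  val MIN_READER_VERSION = 2
  val MIN_WRITER_VERSION = 5

  // Both the writer and the reader versions must meet their minimums
  def satisfyColumnMappingProtocol(protocol: Protocol): Boolean =
    protocol.minWriterVersion >= MIN_WRITER_VERSION &&
      protocol.minReaderVersion >= MIN_READER_VERSION
}
```

In this sketch, `Protocol(2, 5)` satisfies the check, while `Protocol(1, 5)` and `Protocol(2, 4)` do not.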
Allowed Mapping Mode Change¶
allowMappingModeChange(
  oldMode: DeltaColumnMappingMode,
  newMode: DeltaColumnMappingMode): Boolean
allowMappingModeChange is true when either of the following holds true:
- There is no mode change (and the old and new modes are the same)
- There is a mode change from NoMapping old mode to NameMapping
Otherwise, allowMappingModeChange is false.
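The allowed transitions can be sketched as a single boolean expression. The mode hierarchy below is a simplified stand-in defined only for this illustration, and the enclosing object name is hypothetical.

```scala
// A minimal sketch with stand-in types; not Delta Lake's actual implementation.
object MappingModeChangeSketch {
  sealed trait DeltaColumnMappingMode
  case object NoMapping extends DeltaColumnMappingMode
  case object IdMapping extends DeltaColumnMappingMode
  case object NameMapping extends DeltaColumnMappingMode

  def allowMappingModeChange(
      oldMode: DeltaColumnMappingMode,
      newMode: DeltaColumnMappingMode): Boolean =
    // Either no change at all, or the upgrade from NoMapping to NameMapping
    oldMode == newMode || (oldMode == NoMapping && newMode == NameMapping)
}
```

Under this rule, going from NameMapping back to NoMapping, or from NoMapping to IdMapping, is rejected.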
DeltaColumnMapping¶
DeltaColumnMapping is the only DeltaColumnMappingBase.
Supported Column Mapping Modes¶
supportedModes: Set[DeltaColumnMappingMode]
DeltaColumnMappingBase defines supportedModes value with NoMapping and NameMapping column mapping modes.
supportedModes is used when:
- DeltaColumnMappingBase is requested to verifyAndUpdateMetadataChange and createPhysicalSchema
getColumnMappingMetadata¶
getColumnMappingMetadata(
  field: StructField,
  mode: DeltaColumnMappingMode): Metadata
Note
getColumnMappingMetadata returns Spark SQL's Metadata, not Delta Lake's.
getColumnMappingMetadata...FIXME
getColumnMappingMetadata is used when:
- DeltaColumnMappingBase is requested to setColumnMetadata and createPhysicalSchema