DeltaColumnMappingBase (DeltaColumnMapping)

DeltaColumnMappingBase is an abstraction of DeltaColumnMappings.

Implementations

Compatible Protocol

DeltaColumnMappingBase defines a Protocol (with MIN_READER_VERSION and MIN_WRITER_VERSION) as the minimum protocol version for readers and writers of delta tables with column mapping.

Minimum Reader Version

DeltaColumnMappingBase defines MIN_READER_VERSION constant as 2 for the minimum version of the compatible readers of delta tables to satisfyColumnMappingProtocol.

Minimum Writer Version

DeltaColumnMappingBase defines MIN_WRITER_VERSION constant as 5 for the minimum version of the compatible writers of delta tables to satisfyColumnMappingProtocol.

createPhysicalSchema

createPhysicalSchema(
  schema: StructType,
  referenceSchema: StructType,
  columnMappingMode: DeltaColumnMappingMode,
  checkSupportedMode: Boolean = true): StructType

createPhysicalSchema...FIXME

createPhysicalSchema is used when:

renameColumns

renameColumns(
  schema: StructType): StructType

renameColumns...FIXME

renameColumns is used when:

requiresNewProtocol

requiresNewProtocol(
  metadata: Metadata): Boolean

requiresNewProtocol is true when the DeltaColumnMappingMode (of this delta table per the given Metadata) is either IdMapping or NameMapping. Otherwise, requiresNewProtocol is false.
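The rule above can be sketched as a pattern match over the mapping mode. This is a minimal illustration only: the Mode hierarchy and the one-field Metadata below are hypothetical stand-ins defined for this example, not Delta Lake's actual classes.

```scala
object RequiresNewProtocolSketch {
  // Hypothetical stand-ins for Delta Lake's DeltaColumnMappingMode values
  sealed trait Mode
  case object NoMapping extends Mode
  case object IdMapping extends Mode
  case object NameMapping extends Mode

  // Simplified stand-in for Metadata that only carries the column mapping mode
  final case class Metadata(columnMappingMode: Mode)

  // true only for the two modes that actually map columns
  def requiresNewProtocol(metadata: Metadata): Boolean =
    metadata.columnMappingMode match {
      case IdMapping | NameMapping => true
      case NoMapping               => false
    }
}
```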

requiresNewProtocol is used when:

checkColumnIdAndPhysicalNameAssignments

checkColumnIdAndPhysicalNameAssignments(
  schema: StructType,
  mode: DeltaColumnMappingMode): Unit

checkColumnIdAndPhysicalNameAssignments...FIXME

checkColumnIdAndPhysicalNameAssignments is used when:

dropColumnMappingMetadata

dropColumnMappingMetadata(
  schema: StructType): StructType

dropColumnMappingMetadata...FIXME

dropColumnMappingMetadata is used when:

Mapping Virtual to Physical Field Name

getPhysicalName(
  field: StructField): String

getPhysicalName requests the given StructField (Spark SQL) for the Metadata to extract delta.columnMapping.physicalName key, if available (for column mapping). Otherwise, getPhysicalName returns the name of the given StructField (with no name changes).
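The lookup-with-fallback described above can be sketched without a Spark dependency. The Field case class below is a hypothetical stand-in for Spark SQL's StructField, with its metadata reduced to a plain string map; only the metadata key comes from the text.

```scala
object PhysicalNameSketch {
  // Metadata key named in the description above
  val PhysicalNameKey = "delta.columnMapping.physicalName"

  // Hypothetical stand-in for StructField: a name plus string-valued metadata
  final case class Field(name: String, metadata: Map[String, String] = Map.empty)

  // Use the physical name when the key is present, else the field's own name
  def getPhysicalName(field: Field): String =
    field.metadata.getOrElse(PhysicalNameKey, field.name)
}
```

With a real StructField, the same check could presumably be written with field.metadata.contains(key) and field.metadata.getString(key), both of which are methods of Spark SQL's Metadata.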

getPhysicalName is used when:

verifyAndUpdateMetadataChange

verifyAndUpdateMetadataChange(
  oldProtocol: Protocol,
  oldMetadata: Metadata,
  newMetadata: Metadata,
  isCreatingNewTable: Boolean): Metadata

verifyAndUpdateMetadataChange...FIXME

In the end, verifyAndUpdateMetadataChange calls tryFixMetadata with the given oldMetadata and newMetadata.

verifyAndUpdateMetadataChange is used when:

tryFixMetadata

tryFixMetadata(
  oldMetadata: Metadata,
  newMetadata: Metadata,
  isChangingModeOnExistingTable: Boolean): Metadata

tryFixMetadata reads the columnMapping.mode table property from the given newMetadata.

If the DeltaColumnMappingMode is IdMapping or NameMapping, tryFixMetadata calls assignColumnIdAndPhysicalName with the given newMetadata, the oldMetadata and the isChangingModeOnExistingTable flag.

For NoMapping, tryFixMetadata does nothing and returns the given newMetadata.
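The dispatch on the mapping mode can be sketched as follows. The Mode and Metadata types are hypothetical stand-ins, and assignColumnIdAndPhysicalName is a placeholder that merely tags the metadata, since this page does not describe what the real method does.

```scala
object TryFixMetadataSketch {
  // Hypothetical stand-ins for Delta Lake's types
  sealed trait Mode
  case object NoMapping extends Mode
  case object IdMapping extends Mode
  case object NameMapping extends Mode
  final case class Metadata(columnMappingMode: Mode, note: String = "")

  // Placeholder only: the real assignment logic is not covered here
  def assignColumnIdAndPhysicalName(
      newMetadata: Metadata,
      oldMetadata: Metadata,
      isChangingModeOnExistingTable: Boolean): Metadata =
    newMetadata.copy(note = "ids and physical names assigned")

  def tryFixMetadata(
      oldMetadata: Metadata,
      newMetadata: Metadata,
      isChangingModeOnExistingTable: Boolean): Metadata =
    newMetadata.columnMappingMode match {
      case IdMapping | NameMapping =>
        assignColumnIdAndPhysicalName(newMetadata, oldMetadata, isChangingModeOnExistingTable)
      case NoMapping =>
        newMetadata // nothing to fix, returned unchanged
    }
}
```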

satisfyColumnMappingProtocol

satisfyColumnMappingProtocol(
  protocol: Protocol): Boolean

satisfyColumnMappingProtocol returns true when both of the following hold:

  1. minWriterVersion of the given Protocol is at least 5
  2. minReaderVersion of the given Protocol is at least 2
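The two conditions above amount to a pair of version comparisons. In this sketch, Protocol is a hypothetical two-field stand-in for Delta Lake's Protocol; the constant values (2 and 5) are the ones given earlier on this page.

```scala
object ProtocolSketch {
  // Hypothetical stand-in for Delta Lake's Protocol
  final case class Protocol(minReaderVersion: Int, minWriterVersion: Int)

  // Minimum versions for column mapping, as stated above
  val MIN_READER_VERSION = 2
  val MIN_WRITER_VERSION = 5

  // Both the writer and the reader versions must meet the minimums
  def satisfyColumnMappingProtocol(protocol: Protocol): Boolean =
    protocol.minWriterVersion >= MIN_WRITER_VERSION &&
      protocol.minReaderVersion >= MIN_READER_VERSION
}
```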

Allowed Mapping Mode Change

allowMappingModeChange(
  oldMode: DeltaColumnMappingMode,
  newMode: DeltaColumnMappingMode): Boolean

allowMappingModeChange is true when either of the following holds true:

  1. There is no mode change (and the old and new modes are the same)
  2. There is a mode change from NoMapping old mode to NameMapping

Otherwise, allowMappingModeChange is false.
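The two allowed cases can be written as a single boolean expression. The Mode hierarchy below is a hypothetical stand-in for DeltaColumnMappingMode, used only to keep the sketch self-contained.

```scala
object MappingModeChangeSketch {
  // Hypothetical stand-ins for Delta Lake's DeltaColumnMappingMode values
  sealed trait Mode
  case object NoMapping extends Mode
  case object IdMapping extends Mode
  case object NameMapping extends Mode

  // Allowed: no change at all, or an upgrade from NoMapping to NameMapping
  def allowMappingModeChange(oldMode: Mode, newMode: Mode): Boolean =
    oldMode == newMode || (oldMode == NoMapping && newMode == NameMapping)
}
```

Under this rule, moving from NameMapping back to NoMapping, or from NoMapping to IdMapping, is rejected.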

DeltaColumnMapping

DeltaColumnMapping is the only DeltaColumnMappingBase.

Supported Column Mapping Modes

supportedModes: Set[DeltaColumnMappingMode]

DeltaColumnMappingBase defines supportedModes value with NoMapping and NameMapping column mapping modes.

supportedModes is used when:

getColumnMappingMetadata

getColumnMappingMetadata(
  field: StructField,
  mode: DeltaColumnMappingMode): Metadata

Note

getColumnMappingMetadata returns Spark SQL's Metadata, not Delta Lake's.

getColumnMappingMetadata...FIXME

getColumnMappingMetadata is used when: