ImplicitMetadataOperation

ImplicitMetadataOperation is an abstraction of operations that can update the metadata of a delta table (while writing out new data).

ImplicitMetadataOperation operations can update the table schema by merging or overwriting it.
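As a rough illustration, the two schema-related members of the contract (canMergeSchema and canOverwriteSchema) can be mirrored in a small standalone trait. This is only a sketch: ImplicitMetadataOperationSketch, ExampleWrite, ContractSketch and describe are hypothetical names made up for this example and are not part of Delta Lake.

```scala
// Hypothetical, simplified mirror of the contract (not the Delta Lake source).
trait ImplicitMetadataOperationSketch {
  /** Whether the operation is allowed to merge schema. */
  def canMergeSchema: Boolean

  /** Whether the operation is allowed to overwrite schema. */
  def canOverwriteSchema: Boolean
}

// Hypothetical operation; the constructor parameters implement the two members.
final case class ExampleWrite(
    canMergeSchema: Boolean,
    canOverwriteSchema: Boolean) extends ImplicitMetadataOperationSketch

object ContractSketch {
  // Summarizes which schema updates an operation allows.
  def describe(op: ImplicitMetadataOperationSketch): String =
    (op.canMergeSchema, op.canOverwriteSchema) match {
      case (true, true)   => "merge and overwrite"
      case (true, false)  => "merge only"
      case (false, true)  => "overwrite only"
      case (false, false) => "no schema updates"
    }
}
```

For example, ContractSketch.describe(ExampleWrite(canMergeSchema = true, canOverwriteSchema = false)) evaluates to "merge only".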

Contract

canMergeSchema

canMergeSchema: Boolean

Used when:

canOverwriteSchema

canOverwriteSchema: Boolean

Used when:

Implementations

Updating Metadata

updateMetadata(
  spark: SparkSession,
  txn: OptimisticTransaction,
  schema: StructType,
  partitionColumns: Seq[String],
  configuration: Map[String, String],
  isOverwriteMode: Boolean,
  rearrangeOnly: Boolean): Unit

updateMetadata uses dropColumnMappingMetadata to drop the column mapping metadata from the given schema (which gives the dataSchema).

updateMetadata calls mergeSchema (with the dataSchema and the isOverwriteMode and canOverwriteSchema flags).

updateMetadata calls normalizePartitionColumns to normalize the partition columns.

updateMetadata branches off based on the following conditions:

  1. Delta table is just being created
  2. Overwriting schema is enabled (i.e. isOverwriteMode and canOverwriteSchema flags are enabled, and either the schema is new or partitioning changed)
  3. Merging schema is enabled (i.e. the schema is new and the canMergeSchema flag is enabled, but the partitioning has not changed)
  4. Data or Partitioning Schema has changed
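The four conditions above can be read as a chain of checks. The following is a minimal sketch of that reading, assuming the conditions are tested in the listed order and the first match wins. All names (MetadataAction, UpdateMetadataBranching, decide, and the Boolean inputs isNewTable, isNewSchema and isPartitioningChanged) are hypothetical, and the NoChange outcome for the case where none of the conditions holds is an assumption of this sketch.

```scala
// Hypothetical model of the branching; not the Delta Lake source.
sealed trait MetadataAction
case object CreateTable extends MetadataAction
case object OverwriteSchema extends MetadataAction
case object MergeSchema extends MetadataAction
case object SchemaOrPartitioningChanged extends MetadataAction
case object NoChange extends MetadataAction // assumption: none of the four conditions holds

object UpdateMetadataBranching {
  def decide(
      isNewTable: Boolean,
      isOverwriteMode: Boolean,
      canOverwriteSchema: Boolean,
      canMergeSchema: Boolean,
      isNewSchema: Boolean,
      isPartitioningChanged: Boolean): MetadataAction = {
    if (isNewTable) {
      // 1. Delta table is just being created
      CreateTable
    } else if (isOverwriteMode && canOverwriteSchema &&
        (isNewSchema || isPartitioningChanged)) {
      // 2. Overwriting schema is enabled
      OverwriteSchema
    } else if (isNewSchema && canMergeSchema && !isPartitioningChanged) {
      // 3. Merging schema is enabled
      MergeSchema
    } else if (isNewSchema || isPartitioningChanged) {
      // 4. Data or partitioning schema has changed
      SchemaOrPartitioningChanged
    } else {
      NoChange
    }
  }
}
```

Under these assumptions, a new schema with canMergeSchema enabled and unchanged partitioning gives MergeSchema, while the same new schema with canMergeSchema disabled falls through to SchemaOrPartitioningChanged.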

Table Being Created

updateMetadata creates a new Metadata with the following:

  • Uses the value of the comment key (in the configuration) for the description
  • FIXME
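The first item above (the description taken from the comment key) can be sketched as a plain map lookup. MetadataSketch and NewTableMetadataSketch are hypothetical stand-ins made up for this example, and modelling the description as an Option is an assumption of the sketch, not a statement about Delta's Metadata.

```scala
// Hypothetical stand-in for Metadata; models only the description field.
final case class MetadataSketch(description: Option[String])

object NewTableMetadataSketch {
  // Looks up the "comment" key in the configuration and uses it as the description.
  def fromConfiguration(configuration: Map[String, String]): MetadataSketch =
    MetadataSketch(description = configuration.get("comment"))
}
```

With this sketch, a configuration without a comment key gives a MetadataSketch with no description.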

updateMetadata requests the given OptimisticTransaction to updateMetadata.

Overwriting Schema

updateMetadata...FIXME

Merging Schema

updateMetadata...FIXME

New Data or Partitioning Schema

updateMetadata...FIXME

isOverwriteMode

updateMetadata is given isOverwriteMode flag as follows:

rearrangeOnly

updateMetadata is given rearrangeOnly flag as follows:

configuration

updateMetadata is given configuration as follows:

Usage

updateMetadata is used when:

Normalizing Partition Columns

normalizePartitionColumns(
  spark: SparkSession,
  partitionCols: Seq[String],
  schema: StructType): Seq[String]

normalizePartitionColumns...FIXME

mergeSchema

mergeSchema(
  txn: OptimisticTransaction,
  dataSchema: StructType,
  isOverwriteMode: Boolean,
  canOverwriteSchema: Boolean): StructType

mergeSchema...FIXME