ImplicitMetadataOperation — Operations Updating Metadata (Schema And Partitioning)

ImplicitMetadataOperation is an abstraction of operations that can update the metadata of a delta table while writing new data to it.

ImplicitMetadataOperation operations can update the table schema by either merging it with the schema of the data being written or overwriting it.

Table 1. ImplicitMetadataOperation Contract (Abstract Methods Only)
Method Description

canMergeSchema

canMergeSchema: Boolean

Used when ImplicitMetadataOperation is requested to updateMetadata

canOverwriteSchema

canOverwriteSchema: Boolean

Used when ImplicitMetadataOperation is requested to updateMetadata
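
The contract above can be pictured as a small Scala trait. The following is a minimal sketch, not the Delta Lake source: ImplicitMetadataOperationSketch and OptionsBackedOperation are hypothetical names, and reading the flags from write options keyed mergeSchema and overwriteSchema is an assumption made for illustration.

```scala
object ContractSketch {
  // Simplified stand-in for the contract: only the two abstract methods
  // listed in Table 1 are modelled here.
  trait ImplicitMetadataOperationSketch {
    def canMergeSchema: Boolean
    def canOverwriteSchema: Boolean
  }

  // Hypothetical implementation that derives both flags from write options.
  // The option keys are assumptions, not taken from this page.
  final class OptionsBackedOperation(options: Map[String, String])
      extends ImplicitMetadataOperationSketch {

    // A flag is on only when the option is present and equals "true"
    // (ignoring case and surrounding whitespace).
    private def flag(key: String): Boolean =
      options.get(key).exists(_.trim.equalsIgnoreCase("true"))

    override def canMergeSchema: Boolean = flag("mergeSchema")
    override def canOverwriteSchema: Boolean = flag("overwriteSchema")
  }
}
```

Keeping the two flags as separate abstract methods lets each implementation decide on its own where the values come from.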

Table 2. ImplicitMetadataOperations
ImplicitMetadataOperation Description

WriteIntoDelta

Delta command for batch queries (Spark SQL)

DeltaSink

Streaming sink for streaming queries (Spark Structured Streaming)

Updating Metadata — updateMetadata Method

updateMetadata(
  txn: OptimisticTransaction,
  data: Dataset[_],
  partitionColumns: Seq[String],
  configuration: Map[String, String],
  isOverwriteMode: Boolean): Unit

updateMetadata…​FIXME
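
The body of updateMetadata is not described here, so the following is only a hedged sketch of how canMergeSchema, canOverwriteSchema and isOverwriteMode might interact when deciding on the schema to commit. Everything in it is an assumption: the Schema alias, mergeSchemas and resolveSchema are hypothetical helpers, the schema is reduced to (column name, type name) pairs, and the order of the checks is a guess rather than the documented behaviour.

```scala
object SchemaUpdateSketch {
  // Hypothetical, simplified schema model: ordered (column name, type name) pairs.
  type Schema = Seq[(String, String)]

  // Naive merge: keeps the current columns and appends columns that only
  // the incoming schema has (no type widening, no nested fields).
  def mergeSchemas(current: Schema, incoming: Schema): Schema = {
    val known = current.map(_._1).toSet
    current ++ incoming.filterNot { case (name, _) => known.contains(name) }
  }

  // Assumed decision order: overwrite wins when allowed and requested,
  // then merge, otherwise the incoming columns must already exist.
  def resolveSchema(
      current: Schema,
      incoming: Schema,
      canMergeSchema: Boolean,
      canOverwriteSchema: Boolean,
      isOverwriteMode: Boolean): Schema = {
    if (isOverwriteMode && canOverwriteSchema) {
      incoming
    } else if (canMergeSchema) {
      mergeSchemas(current, incoming)
    } else {
      require(
        incoming.toSet.subsetOf(current.toSet),
        "incoming schema does not match the table schema")
      current
    }
  }
}
```

For example, with a current schema of Seq("id" -> "long") and incoming data that also carries a name column, the merge branch yields both columns, while the last branch rejects the write.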

updateMetadata is used when an ImplicitMetadataOperation (WriteIntoDelta or DeltaSink) writes data out to a delta table.

Normalize Partition Columns — normalizePartitionColumns Internal Method

normalizePartitionColumns(
  spark: SparkSession,
  partitionCols: Seq[String],
  schema: StructType): Seq[String]

normalizePartitionColumns…​FIXME

normalizePartitionColumns is used when ImplicitMetadataOperation is requested to updateMetadata.
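
The behaviour of normalizePartitionColumns is left undocumented above. Judging only by the signature (partition column names and a schema in, column names out), one plausible reading is that each requested name is resolved against the schema and replaced with the schema's own spelling. The sketch below illustrates that reading under stated assumptions: PartitionColumnSketch and normalize are hypothetical names, the schema is reduced to a list of field names, and case-insensitive matching is assumed rather than taken from the source.

```scala
object PartitionColumnSketch {
  // Maps every requested partition column to the spelling used in the schema,
  // comparing names case-insensitively. Fails on the first column that the
  // schema does not contain.
  def normalize(
      partitionCols: Seq[String],
      schemaFieldNames: Seq[String]): Seq[String] =
    partitionCols.map { col =>
      schemaFieldNames
        .find(_.equalsIgnoreCase(col))
        .getOrElse(throw new IllegalArgumentException(
          s"Partition column '$col' not found in schema: " +
            schemaFieldNames.mkString(", ")))
    }
}
```

Under these assumptions, normalize(Seq("DATE"), Seq("id", "date")) resolves to the schema's spelling of the column, and an unknown column fails fast instead of producing a partition that no data column backs.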