
Protocol

Protocol is an Action with TableFeatureSupport.

Creating Instance

Protocol takes the following to be created:

  • Minimum Reader Version required to read this delta table
  • Minimum Writer Version required to write to this delta table
  • Reader features that need to be supported to read this delta table (optional)
  • Writer features that need to be supported to write to this delta table (optional)

Protocol is created using apply and forTableFeature factories.

Create Protocol

apply(
  minReaderVersion: Int = Action.readerVersion,
  minWriterVersion: Int = Action.writerVersion): Protocol

apply creates a Protocol for the given minReaderVersion and minWriterVersion.

apply uses supportsReaderFeatures and supportsWriterFeatures to initialize readerFeatures and writerFeatures.
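The way the two version checks could drive the initialization of the feature sets can be sketched as follows. This is a minimal, self-contained model, not the Delta Lake code: ProtocolSketch and createProtocol are hypothetical names, and the thresholds (reader version 3 and writer version 7) are assumptions about when table features become available.

```scala
// Hypothetical stand-in for Protocol; not the Delta Lake class itself.
case class ProtocolSketch(
    minReaderVersion: Int,
    minWriterVersion: Int,
    readerFeatures: Option[Set[String]],
    writerFeatures: Option[Set[String]])

// Assumed thresholds at which table features become available.
val tableFeaturesMinReaderVersion = 3
val tableFeaturesMinWriterVersion = 7

def supportsReaderFeatures(version: Int): Boolean =
  version >= tableFeaturesMinReaderVersion

def supportsWriterFeatures(version: Int): Boolean =
  version >= tableFeaturesMinWriterVersion

// Feature sets are defined (but empty) only when the version supports them.
def createProtocol(minReaderVersion: Int, minWriterVersion: Int): ProtocolSketch =
  ProtocolSketch(
    minReaderVersion,
    minWriterVersion,
    readerFeatures =
      if (supportsReaderFeatures(minReaderVersion)) Some(Set.empty[String]) else None,
    writerFeatures =
      if (supportsWriterFeatures(minWriterVersion)) Some(Set.empty[String]) else None)
```

In this sketch, a legacy protocol such as createProtocol(1, 2) carries no feature sets at all, while createProtocol(3, 7) starts with empty ones.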


apply is used when:

forTableFeature

forTableFeature(
  tf: TableFeature): Protocol

forTableFeature...FIXME


forTableFeature is used when:

  • IcebergCompatBase is requested to enforceInvariantsAndDependencies

forNewTable

forNewTable(
  spark: SparkSession,
  metadataOpt: Option[Metadata]): Protocol

forNewTable creates a new Protocol for the given SparkSession and Metadata.


forNewTable is used when:

  • OptimisticTransactionImpl is requested to updateMetadataInternal
  • DummySnapshot is requested for the protocol

checkProtocolRequirements

checkProtocolRequirements(
  spark: SparkSession,
  metadata: Metadata,
  current: Protocol): Option[Protocol]

checkProtocolRequirements asserts that the table configuration does not contain delta.minReaderVersion or throws an AssertionError:

Should not have the protocol version (delta.minReaderVersion) as part of table properties

checkProtocolRequirements asserts that the table configuration does not contain delta.minWriterVersion or throws an AssertionError:

Should not have the protocol version (delta.minWriterVersion) as part of table properties
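The two assertions above can be illustrated with a small, self-contained sketch. assertNoProtocolVersions is a hypothetical helper that works on a plain Map rather than on Metadata; it only mirrors the documented error messages.

```scala
// Hypothetical helper; the real checks live inside checkProtocolRequirements.
def assertNoProtocolVersions(configuration: Map[String, String]): Unit = {
  Seq("delta.minReaderVersion", "delta.minWriterVersion").foreach { key =>
    // Predef.assert throws java.lang.AssertionError when the condition is false.
    assert(
      !configuration.contains(key),
      s"Should not have the protocol version ($key) as part of table properties")
  }
}
```

A configuration that only sets other table properties passes the check, while one that carries either protocol version key fails with an AssertionError.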

checkProtocolRequirements determines the required minimum protocol.

checkProtocolRequirements...FIXME

checkProtocolRequirements is used when:

Required Minimum Protocol

requiredMinimumProtocol(
  spark: SparkSession,
  metadata: Metadata): (Protocol, Seq[String])

requiredMinimumProtocol creates a Protocol with 0 for the minimum reader and writer versions.

Protocol(0, 0)

requiredMinimumProtocol tracks features used (in featuresUsed).

requiredMinimumProtocol determines the required minimum Protocol checking for the following features (in order):

  1. Column-Level Invariants
  2. Append-Only Table
  3. CHECK Constraints
  4. Generated Columns
  5. Change Data Feed
  6. IDENTITY Columns (Unsupported)
  7. Column Mapping

In the end, requiredMinimumProtocol returns the required Protocol and the features used.
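The ordered checks can be sketched with a simplified model. Everything here is hypothetical (MinProtocol, TableTraits and requiredMinimumProtocolSketch are not Delta Lake names), the boolean flags stand in for what the real method derives from Metadata, and only a subset of the checks is covered. The sketch assumes that each matching check replaces the previous result, which is what the "creates a new Protocol" wording in this section suggests.

```scala
// Hypothetical simplified model; not the Delta Lake classes.
case class MinProtocol(minReaderVersion: Int, minWriterVersion: Int)

// Flags standing in for what the real method reads from Metadata.
case class TableTraits(
    hasColumnInvariants: Boolean = false,
    hasCheckConstraints: Boolean = false,
    hasGeneratedColumns: Boolean = false,
    changeDataFeedEnabled: Boolean = false,
    columnMappingEnabled: Boolean = false)

def requiredMinimumProtocolSketch(traits: TableTraits): (MinProtocol, Seq[String]) = {
  // A subset of the checks, in the documented order: (applies?, label, protocol).
  val checks = Seq(
    (traits.hasColumnInvariants, "Column-Level Invariants", MinProtocol(0, 2)),
    (traits.hasCheckConstraints, "CHECK Constraints", MinProtocol(0, 3)),
    (traits.hasGeneratedColumns, "Generated Columns", MinProtocol(0, 4)),
    (traits.changeDataFeedEnabled, "Change Data Feed", MinProtocol(0, 4)),
    (traits.columnMappingEnabled, "Column Mapping", MinProtocol(2, 5)))
  val used = checks.filter(_._1)
  // Assumption: the last matching check wins; no match leaves Protocol(0, 0).
  val protocol = used.lastOption.map(_._3).getOrElse(MinProtocol(0, 0))
  (protocol, used.map(_._2))
}
```

Because the documented version pairs grow along the list, "last match wins" and "highest requirement wins" give the same answer in this sketch.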


requiredMinimumProtocol is used when:

Column-Level Invariants

requiredMinimumProtocol checks for column-level invariants (in the schema of the given Metadata).

If used, requiredMinimumProtocol sets the minWriterVersion to 2.

Protocol(0, 2)

Append-Only Table

requiredMinimumProtocol reads appendOnly table property (from the table configuration of the given Metadata).

If set, requiredMinimumProtocol creates a new Protocol with the minWriterVersion to be 2.

Protocol(0, 2)

CHECK Constraints

requiredMinimumProtocol checks for CHECK constraints (in the given Metadata).

If used, requiredMinimumProtocol creates a new Protocol with the minWriterVersion to be 3.

Protocol(0, 3)

Generated Columns

requiredMinimumProtocol checks for generated columns (in the schema of the given Metadata).

If used, requiredMinimumProtocol creates a new Protocol with the minWriterVersion to be 4.

Protocol(0, 4)

Change Data Feed

requiredMinimumProtocol checks whether delta.enableChangeDataFeed table property is enabled (in the given Metadata).

If enabled, requiredMinimumProtocol creates a new Protocol with the minWriterVersion to be 4.

Protocol(0, 4)

IDENTITY Columns (Unsupported)

requiredMinimumProtocol checks for identity columns (in the schema of the given Metadata).

If used, requiredMinimumProtocol creates a new Protocol with the minWriterVersion to be 6.

Protocol(0, 6)

AnalysisException

In the end, requiredMinimumProtocol throws an AnalysisException:

IDENTITY column is not supported

Column Mapping

requiredMinimumProtocol checks for column mapping (in the given Metadata).

If used, requiredMinimumProtocol creates a new Protocol with the minReaderVersion to be 2 and the minWriterVersion to be 5.

Protocol(2, 5)

extractAutomaticallyEnabledFeatures

extractAutomaticallyEnabledFeatures(
  spark: SparkSession,
  metadata: Metadata,
  protocol: Option[Protocol] = None): Set[TableFeature]

extractAutomaticallyEnabledFeatures requests the given Protocol for the writerFeatureNames (protocol-enabled table features).

extractAutomaticallyEnabledFeatures finds the FeatureAutomaticallyEnabledByMetadata features (among the allSupportedFeaturesMap) that metadataRequiresFeatureToBeEnabled for the given Metadata (metadata-enabled table features).

In the end, extractAutomaticallyEnabledFeatures finds the smallest set of table features for the protocol- and metadata-enabled table features (incl. their dependencies, if there are any).
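The combination of protocol-enabled and metadata-enabled features, together with their dependencies, can be sketched as a small closure computation. FeatureSketch, withDependencies and automaticallyEnabledSketch are hypothetical names, features are modeled by name only, and the dependency graph in the usage note is made up for illustration.

```scala
// Hypothetical feature model with explicit dependencies; not Delta's TableFeature.
case class FeatureSketch(name: String, requires: Set[String] = Set.empty)

// Expands a set of feature names with everything they transitively require.
def withDependencies(
    names: Set[String],
    all: Map[String, FeatureSketch]): Set[String] = {
  @scala.annotation.tailrec
  def loop(current: Set[String]): Set[String] = {
    val required =
      current.flatMap(n => all.get(n).map(_.requires).getOrElse(Set.empty[String]))
    val expanded = current ++ required
    if (expanded == current) current else loop(expanded)
  }
  loop(names)
}

// Union of protocol- and metadata-enabled features, closed over dependencies.
def automaticallyEnabledSketch(
    protocolEnabled: Set[String],
    metadataEnabled: Set[String],
    all: Map[String, FeatureSketch]): Set[String] =
  withDependencies(protocolEnabled ++ metadataEnabled, all)
```

With a made-up graph in which feature "c" requires "b" and "b" requires "a", enabling "c" through the metadata alone yields all three features, and nothing beyond them.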


extractAutomaticallyEnabledFeatures is used when:

minProtocolComponentsFromMetadata

minProtocolComponentsFromMetadata(
  spark: SparkSession,
  metadata: Metadata): (Int, Int, Set[TableFeature])

minProtocolComponentsFromMetadata...FIXME


minProtocolComponentsFromMetadata is used when:

upgradeProtocolFromMetadataForExistingTable

upgradeProtocolFromMetadataForExistingTable(
  spark: SparkSession,
  metadata: Metadata): (Int, Int, Set[TableFeature])

upgradeProtocolFromMetadataForExistingTable...FIXME


upgradeProtocolFromMetadataForExistingTable is used when:

minProtocolComponentsFromAutomaticallyEnabledFeatures

minProtocolComponentsFromAutomaticallyEnabledFeatures(
  spark: SparkSession,
  metadata: Metadata): (Int, Int, Set[TableFeature])

minProtocolComponentsFromAutomaticallyEnabledFeatures determines the minimum reader and writer versions based on automatically enabled table features.
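One plausible way to derive the versions from a set of features is to take the highest reader and writer version any feature asks for. The sketch below assumes exactly that; VersionedFeature and minProtocolComponentsSketch are hypothetical names, the starting value of 0 is an assumption, and the feature set is passed in directly instead of being extracted from SparkSession and Metadata.

```scala
// Hypothetical feature model carrying its own minimum protocol versions.
case class VersionedFeature(name: String, minReaderVersion: Int, minWriterVersion: Int)

// Returns (reader version, writer version, features), mirroring the result shape.
def minProtocolComponentsSketch(
    features: Set[VersionedFeature]): (Int, Int, Set[VersionedFeature]) = {
  // Assumption: the required version is the maximum over all enabled features.
  val readerVersion = features.foldLeft(0)((v, f) => math.max(v, f.minReaderVersion))
  val writerVersion = features.foldLeft(0)((v, f) => math.max(v, f.minWriterVersion))
  (readerVersion, writerVersion, features)
}
```

An empty feature set gives (0, 0) in this sketch, and adding a feature can only raise either version.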

Demo

import org.apache.spark.sql.delta.actions.{Metadata, Protocol}
import org.apache.spark.sql.delta.DeltaConfigs

// Mark the table as append-only
val configuration = Map(
  DeltaConfigs.IS_APPEND_ONLY.key -> "true")
val metadata = Metadata(configuration = configuration)
// forNewTable takes an Option[Metadata]
val protocol = Protocol.forNewTable(spark, Some(metadata))
assert(
  protocol.minReaderVersion == 1,
  "minReaderVersion should be the default 1")
assert(
  protocol.minWriterVersion == 2,
  "minWriterVersion should be 2 because of append-only tables")