Protocol

Protocol is an Action.

Creating Instance

Protocol takes the following to be created:

  • Minimum Reader Version Allowed (default: 1)
  • Minimum Writer Version Allowed (default: 3)

Protocol is created when:

forNewTable

forNewTable(
  spark: SparkSession,
  metadata: Metadata): Protocol

forNewTable creates a new Protocol for the given SparkSession and Metadata.

forNewTable is used when:

apply

apply(
  spark: SparkSession,
  metadataOpt: Option[Metadata]): Protocol

apply...FIXME

checkProtocolRequirements

checkProtocolRequirements(
  spark: SparkSession,
  metadata: Metadata,
  current: Protocol): Option[Protocol]

checkProtocolRequirements asserts that the table configuration does not contain delta.minReaderVersion. Otherwise, it throws an AssertionError with the following message:

Should not have the protocol version (delta.minReaderVersion) as part of table properties

checkProtocolRequirements asserts that the table configuration does not contain delta.minWriterVersion. Otherwise, it throws an AssertionError with the following message:

Should not have the protocol version (delta.minWriterVersion) as part of table properties
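
The two assertions above can be sketched in plain Scala. This is a minimal illustration, not the Delta Lake code: the object name ProtocolKeysCheck and the method assertNoProtocolKeys are hypothetical, and a plain Map stands in for the table configuration of a Metadata.

```scala
// Hypothetical stand-in for the checks described above, not Delta Lake's own code.
object ProtocolKeysCheck {
  // Property names taken from the assertion messages above.
  val ProtocolKeys: Seq[String] =
    Seq("delta.minReaderVersion", "delta.minWriterVersion")

  // Throws java.lang.AssertionError if a protocol version key is present
  // in the given table configuration.
  def assertNoProtocolKeys(configuration: Map[String, String]): Unit =
    ProtocolKeys.foreach { key =>
      assert(
        !configuration.contains(key),
        s"Should not have the protocol version ($key) as part of table properties")
    }
}
```

With this sketch, a configuration such as Map("delta.appendOnly" -> "true") passes, while one that contains delta.minWriterVersion fails with an AssertionError.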

checkProtocolRequirements determines the required minimum protocol.

checkProtocolRequirements...FIXME

checkProtocolRequirements is used when:

Required Minimum Protocol

requiredMinimumProtocol(
  spark: SparkSession,
  metadata: Metadata): (Protocol, Seq[String])

requiredMinimumProtocol creates a Protocol with 0 for the minimum reader and writer versions.

Protocol(0, 0)

requiredMinimumProtocol tracks features used (in featuresUsed).

requiredMinimumProtocol determines the required minimum Protocol checking for the following features (in order):

  1. Column Invariants
  2. Append-Only Table
  3. CHECK Constraints
  4. Generated Columns
  5. Change Data Feed
  6. IDENTITY Columns (Unsupported)
  7. Column Mapping

In the end, requiredMinimumProtocol returns the required Protocol and the features used.
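
The overall flow can be modeled with a small, self-contained sketch. Everything in it is an assumption made for illustration: TableTraits is a hypothetical set of flags standing in for what the real method derives from Metadata, Protocol is a simplified stand-in case class, and each matching check simply replaces the current minimum (the documented versions never decrease across the supported features, so the last match is also the highest). Append-only is modeled with writer version 2, the value the Demo at the end of this page asserts. A plain UnsupportedOperationException stands in for Spark's AnalysisException.

```scala
import scala.collection.mutable.ListBuffer

// Simplified model of the checks described above, not the Delta Lake implementation.
object RequiredMinimumProtocolSketch {
  // Stand-in for org.apache.spark.sql.delta.actions.Protocol.
  final case class Protocol(minReaderVersion: Int, minWriterVersion: Int)

  // Hypothetical flags; the real method inspects the schema and configuration of Metadata.
  final case class TableTraits(
      hasColumnInvariants: Boolean = false,
      isAppendOnly: Boolean = false,
      hasCheckConstraints: Boolean = false,
      hasGeneratedColumns: Boolean = false,
      changeDataFeedEnabled: Boolean = false,
      hasIdentityColumns: Boolean = false,
      columnMappingEnabled: Boolean = false)

  def requiredMinimumProtocol(traits: TableTraits): (Protocol, Seq[String]) = {
    var minimum = Protocol(0, 0)
    val featuresUsed = ListBuffer.empty[String]

    // Records a used feature and replaces the current minimum protocol.
    def check(used: Boolean, feature: String, required: Protocol): Unit =
      if (used) {
        minimum = required
        featuresUsed += feature
      }

    // Checks run in the documented order.
    check(traits.hasColumnInvariants, "Column Invariants", Protocol(0, 2))
    check(traits.isAppendOnly, "Append-Only Table", Protocol(0, 2))
    check(traits.hasCheckConstraints, "CHECK Constraints", Protocol(0, 3))
    check(traits.hasGeneratedColumns, "Generated Columns", Protocol(0, 4))
    check(traits.changeDataFeedEnabled, "Change Data Feed", Protocol(0, 4))
    if (traits.hasIdentityColumns) {
      // The real method throws an AnalysisException here.
      throw new UnsupportedOperationException("IDENTITY column is not supported")
    }
    check(traits.columnMappingEnabled, "Column Mapping", Protocol(2, 5))

    (minimum, featuresUsed.toList)
  }
}
```

For example, a table with column invariants and Change Data Feed enabled yields Protocol(0, 4) in this model, together with both feature labels in the order they were checked.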


requiredMinimumProtocol is used when:

Column Invariants

requiredMinimumProtocol checks for column-level invariants (in the schema of the given Metadata).

If used, requiredMinimumProtocol sets the minWriterVersion to 2.

Protocol(0, 2)

Append-Only Table

requiredMinimumProtocol reads appendOnly table property (from the table configuration of the given Metadata).

If set, requiredMinimumProtocol creates a new Protocol with minWriterVersion set to 2.

Protocol(0, 2)

CHECK Constraints

requiredMinimumProtocol checks for CHECK constraints (in the given Metadata).

If used, requiredMinimumProtocol creates a new Protocol with minWriterVersion set to 3.

Protocol(0, 3)

Generated Columns

requiredMinimumProtocol checks for generated columns (in the schema of the given Metadata).

If used, requiredMinimumProtocol creates a new Protocol with minWriterVersion set to 4.

Protocol(0, 4)

Change Data Feed

requiredMinimumProtocol checks whether delta.enableChangeDataFeed table property is enabled (in the given Metadata).

If enabled, requiredMinimumProtocol creates a new Protocol with minWriterVersion set to 4.

Protocol(0, 4)

IDENTITY Columns (Unsupported)

requiredMinimumProtocol checks for identity columns (in the schema of the given Metadata).

If used, requiredMinimumProtocol creates a new Protocol with minWriterVersion set to 6.

Protocol(0, 6)

AnalysisException

In the end, requiredMinimumProtocol throws an AnalysisException:

IDENTITY column is not supported

Column Mapping

requiredMinimumProtocol checks for column mapping (in the given Metadata).

If used, requiredMinimumProtocol creates a new Protocol with minReaderVersion set to 2 and minWriterVersion set to 5.

Protocol(2, 5)

extractAutomaticallyEnabledFeatures

extractAutomaticallyEnabledFeatures(
  spark: SparkSession,
  metadata: Metadata,
  protocol: Option[Protocol] = None): Set[TableFeature]

extractAutomaticallyEnabledFeatures requests the given Protocol for the writerFeatureNames (protocol-enabled table features).

extractAutomaticallyEnabledFeatures finds the FeatureAutomaticallyEnabledByMetadata features (among the allSupportedFeaturesMap) for which metadataRequiresFeatureToBeEnabled holds for the given Metadata (metadata-enabled table features).

In the end, extractAutomaticallyEnabledFeatures finds the smallest set of table features for the protocol- and metadata-enabled table features (incl. their dependencies, if there are any).
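
The combination of protocol-enabled and metadata-enabled features, expanded with their dependencies, can be sketched as a transitive closure. The model below is hypothetical: Feature, requiredFeatures, withDependencies and automaticallyEnabled are names made up for this example and are not the Delta Lake types, and the metadata check is reduced to a plain predicate.

```scala
// Simplified model of the feature resolution described above, not the Delta Lake code.
object FeatureClosureSketch {
  // Hypothetical table feature: a name plus the features it depends on.
  final case class Feature(name: String, requiredFeatures: Set[Feature] = Set.empty)

  // Expands a set of features with all of their transitive dependencies.
  def withDependencies(features: Set[Feature]): Set[Feature] = {
    @scala.annotation.tailrec
    def loop(pending: List[Feature], seen: Set[Feature]): Set[Feature] =
      pending match {
        case Nil => seen
        case f :: rest if seen.contains(f) => loop(rest, seen)
        case f :: rest => loop(f.requiredFeatures.toList ++ rest, seen + f)
      }
    loop(features.toList, Set.empty)
  }

  // Unions protocol-enabled features with the supported features that the
  // metadata requires, then adds whatever those features depend on.
  def automaticallyEnabled(
      protocolEnabled: Set[Feature],
      allSupported: Set[Feature],
      metadataRequires: Feature => Boolean): Set[Feature] = {
    val metadataEnabled = allSupported.filter(metadataRequires)
    withDependencies(protocolEnabled ++ metadataEnabled)
  }
}
```

In this model, a feature that the metadata requires pulls in its dependencies even when neither the protocol nor the metadata enables them directly, and features that nothing requires stay out of the result.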


extractAutomaticallyEnabledFeatures is used when:

minProtocolComponentsFromMetadata

minProtocolComponentsFromMetadata(
  spark: SparkSession,
  metadata: Metadata): (Int, Int, Set[TableFeature])

minProtocolComponentsFromMetadata...FIXME


minProtocolComponentsFromMetadata is used when:

upgradeProtocolFromMetadataForExistingTable

upgradeProtocolFromMetadataForExistingTable(
  spark: SparkSession,
  metadata: Metadata): (Int, Int, Set[TableFeature])

upgradeProtocolFromMetadataForExistingTable...FIXME


upgradeProtocolFromMetadataForExistingTable is used when:

minProtocolComponentsFromAutomaticallyEnabledFeatures

minProtocolComponentsFromAutomaticallyEnabledFeatures(
  spark: SparkSession,
  metadata: Metadata): (Int, Int, Set[TableFeature])

minProtocolComponentsFromAutomaticallyEnabledFeatures determines the minimum reader and writer versions based on automatically enabled table features.
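
One plausible reading of this step is that the result is the highest minimum reader version and the highest minimum writer version across the enabled features, returned together with the features themselves. The sketch below assumes exactly that. The Feature case class, its fields and the minProtocolComponents method are hypothetical and only mirror the shape of the (Int, Int, Set[TableFeature]) result.

```scala
// Simplified model of the version computation described above, not the Delta Lake code.
object MinProtocolComponentsSketch {
  // Hypothetical table feature with the protocol versions it needs.
  final case class Feature(name: String, minReaderVersion: Int, minWriterVersion: Int)

  // Returns (reader version, writer version, features), starting from 0
  // when no feature is enabled.
  def minProtocolComponents(enabled: Set[Feature]): (Int, Int, Set[Feature]) = {
    val reader = enabled.foldLeft(0)((v, f) => math.max(v, f.minReaderVersion))
    val writer = enabled.foldLeft(0)((v, f) => math.max(v, f.minWriterVersion))
    (reader, writer, enabled)
  }
}
```

The feature names and version numbers passed to this sketch are up to the caller; none of them are taken from Delta Lake.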

Demo

import org.apache.spark.sql.delta.actions.{Metadata, Protocol}
import org.apache.spark.sql.delta.DeltaConfigs

val configuration = Map(
  DeltaConfigs.IS_APPEND_ONLY.key -> "true") // Append-only table
val metadata = Metadata(configuration = configuration)
val protocol = Protocol.forNewTable(spark, metadata)

assert(
  protocol.minReaderVersion == 1,
  "minReaderVersion should be the default 1")
assert(
  protocol.minWriterVersion == 2,
  "minWriterVersion should be 2 because of append-only tables")