DeltaConfigs — Configuration Properties Of Delta Table (Metadata)

DeltaConfigs defines reservoir configuration properties (aka table properties).

Table properties can be assigned a value using the ALTER TABLE SQL command:

ALTER TABLE <table_name> SET TBLPROPERTIES (<key>=<value>)

DeltaConfigs uses the spark.databricks.delta.properties.defaults prefix for global (session-scoped) configuration properties.
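For example, a session-scoped default can be set with that prefix so it applies to delta tables created afterwards in the session (a sketch using the appendOnly property from the table below):

```sql
-- Session-scoped default (applies to new delta tables)
SET spark.databricks.delta.properties.defaults.appendOnly = true
```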

Table 1. Reservoir Configuration Properties

appendOnly

Whether a delta table is append-only (true) or not (false). When enabled, a table allows appends only and no updates or deletes.

Default: false

autoOptimize

Whether this delta table automatically optimizes the layout of files during writes.

Default: false

checkpointInterval

How often to checkpoint the state of a delta table

Default: 10

checkpointRetentionDuration

How long to keep checkpoint files around before deleting them

Default: interval 2 days

The most recent checkpoint is never deleted. It is acceptable to keep checkpoint files beyond this duration until the next calendar day.

compatibility.symlinkFormatManifest.enabled

Whether to register the GenerateSymlinkManifest post-commit hook when committing a transaction

Default: false

dataSkippingNumIndexedCols

The number of columns to collect stats on for data skipping. -1 means collecting stats for all columns.

Default: 32

deletedFileRetentionDuration

How long to keep logically deleted data files around before deleting them physically (to prevent failures in stale readers after compactions or partition overwrites)

Default: interval 1 week

enableExpiredLogCleanup

Whether to clean up expired log files and checkpoints

Default: true

enableFullRetentionRollback

When enabled (default), a delta table can be rolled back to any point within logRetentionDuration. When disabled, the table can only be rolled back within checkpointRetentionDuration.

Default: true

logRetentionDuration

How long to keep obsolete logs around before deleting them. Delta can keep logs beyond the duration until the next calendar day to avoid constantly creating checkpoints.

Default: interval 30 days

randomizeFilePrefixes

Whether to use a random prefix in a file path instead of partition information (may be required for very high-volume S3 workloads, so that requests are better partitioned across S3 servers)

Default: false

randomPrefixLength

The length of the random prefix in a file path when randomizeFilePrefixes is enabled

Default: 2

sampleRetentionDuration

How long to keep delta sample files around before deleting them

Default: interval 7 days
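The properties above are set per table with the delta. prefix (added by buildConfig, described below). A sketch, assuming a hypothetical delta table named events:

```sql
-- Assumes a delta table named events
ALTER TABLE events SET TBLPROPERTIES (
  'delta.appendOnly' = 'true',
  'delta.checkpointInterval' = '20'
)

-- Review the table properties currently set
SHOW TBLPROPERTIES events
```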

mergeGlobalConfigs Utility

mergeGlobalConfigs(
  sqlConfs: SQLConf,
  tableConf: Map[String, String],
  protocol: Protocol): Map[String, String]

mergeGlobalConfigs collects the values of the spark.databricks.delta.properties.defaults-prefixed configuration properties (from the given SQLConf) and merges them with the given table properties, with the table properties taking precedence.

mergeGlobalConfigs is used when OptimisticTransactionImpl is requested for the metadata, to update the metadata, and prepare a commit (for new delta tables).
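The merge can be sketched with plain Scala maps (a simplified model, not the actual implementation, which reads the defaults from SQLConf and validates values against the registered configuration properties and the table's Protocol):

```scala
// Simplified sketch: session-level defaults use the
// spark.databricks.delta.properties.defaults. prefix and are rewritten
// to their table-property form (delta.*); explicitly-set table
// properties take precedence over the session-level defaults.
object MergeGlobalConfigsSketch {
  val sqlConfPrefix = "spark.databricks.delta.properties.defaults."

  def mergeGlobalConfigs(
      sqlConfs: Map[String, String],
      tableConf: Map[String, String]): Map[String, String] = {
    // Keep only the defaults-prefixed session properties and
    // turn them into delta.-prefixed table properties
    val globalDefaults = sqlConfs.collect {
      case (key, value) if key.startsWith(sqlConfPrefix) =>
        ("delta." + key.stripPrefix(sqlConfPrefix)) -> value
    }
    // Table properties override session-level defaults
    globalDefaults ++ tableConf
  }
}
```

For example, a session default for appendOnly is overridden by an explicit delta.appendOnly table property, while unrelated session properties are ignored.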

Creating DeltaConfig Instance — buildConfig Internal Utility

buildConfig[T](
  key: String,
  defaultValue: String,
  fromString: String => T,
  validationFunction: T => Boolean,
  helpMessage: String,
  minimumProtocolVersion: Option[Protocol] = None): DeltaConfig[T]

buildConfig creates a DeltaConfig for the given key (with the delta. prefix added) and adds it to the entries internal registry.

buildConfig is used to define all of the reservoir configuration properties.
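A simplified, self-contained model of the registry (hypothetical names; the real DeltaConfig also carries minimumProtocolVersion, which is omitted here):

```scala
import scala.collection.mutable

// Hypothetical sketch of a DeltaConfig: a typed view of one
// table property with a default, a parser, and a validator.
final case class DeltaConfigSketch[T](
    key: String,
    defaultValue: String,
    fromString: String => T,
    validationFunction: T => Boolean,
    helpMessage: String) {
  // Read this property from a table configuration, falling back
  // to the default, and validate the parsed value
  def fromMap(conf: Map[String, String]): T = {
    val value = fromString(conf.getOrElse(key, defaultValue))
    require(validationFunction(value), s"$key $helpMessage")
    value
  }
}

object DeltaConfigsSketch {
  // The entries registry, keyed by the lower-cased property key
  val entries = mutable.HashMap.empty[String, DeltaConfigSketch[_]]

  def buildConfig[T](
      key: String,
      defaultValue: String,
      fromString: String => T,
      validationFunction: T => Boolean,
      helpMessage: String): DeltaConfigSketch[T] = {
    // Add the delta. prefix and register the new entry
    val config = DeltaConfigSketch(
      "delta." + key, defaultValue, fromString, validationFunction, helpMessage)
    entries(config.key.toLowerCase) = config
    config
  }

  // e.g. checkpointInterval from the table above
  val CHECKPOINT_INTERVAL = buildConfig[Int](
    "checkpointInterval", "10", _.toInt, _ > 0, "needs to be a positive integer.")
}
```

With this model, CHECKPOINT_INTERVAL.fromMap(Map.empty) yields the default (10), while a table configuration with delta.checkpointInterval set overrides it.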

Internal Properties


entries

HashMap[String, DeltaConfig[_]]

Registry of the reservoir configuration properties (by key). New entries are added using buildConfig.