DeltaConfigs (DeltaConfigsBase)

DeltaConfigs holds the supported table properties in Delta Lake.

Accessing DeltaConfigs

import org.apache.spark.sql.delta.OptimisticTransaction
val txn: OptimisticTransaction = ???
import org.apache.spark.sql.delta.actions.Metadata
val metadata: Metadata = txn.metadata
import org.apache.spark.sql.delta.DeltaConfigs
DeltaConfigs.CHANGE_DATA_FEED.fromMetaData(metadata)

System-Wide Defaults

The spark.databricks.delta.properties.defaults prefix is used for system-wide (global) table properties.

For every table property (without the delta. prefix) there is a corresponding system-wide configuration property with the spark.databricks.delta.properties.defaults prefix that defines the default value of the table property for all delta tables.
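
A minimal sketch of setting a system-wide default for the delta.appendOnly table property (note that the property name follows the prefix without the delta. part):

import org.apache.spark.sql.SparkSession
val spark: SparkSession = ???
// All new delta tables become append-only unless overridden per table
spark.conf.set("spark.databricks.delta.properties.defaults.appendOnly", "true")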

Table Properties

All table properties start with the delta. prefix.

appendOnly

delta.appendOnly

Turns a table into append-only

When enabled, a table allows appends only and no updates or deletes.

Default: false

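A minimal sketch of enabling delta.appendOnly on an existing table (delta_demo is a hypothetical table name):

spark.sql("ALTER TABLE delta_demo SET TBLPROPERTIES ('delta.appendOnly' = 'true')")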

autoOptimize

delta.autoOptimize

Deprecated

delta.autoOptimize is deprecated in favour of the delta.autoOptimize.autoCompact table property since 3.1.0.

Whether this delta table will automagically optimize the layout of files during writes.

Default: false

autoOptimize.autoCompact

delta.autoOptimize.autoCompact

Enables Auto Compaction

Default: false

Replaces delta.autoOptimize

delta.autoOptimize.autoCompact replaces the delta.autoOptimize table property since 3.1.0.

checkpointInterval

delta.checkpointInterval

How often to checkpoint the state of a delta table (at the end of transaction commit)

Default: 10

checkpointRetentionDuration

delta.checkpointRetentionDuration

How long to keep checkpoint files around before deleting them

Default: interval 2 days

The most recent checkpoint is never deleted. It is acceptable to keep checkpoint files beyond this duration until the next calendar day.

checkpoint.writeStatsAsJson

delta.checkpoint.writeStatsAsJson

Controls whether to write file statistics in the checkpoint in JSON format as the stats column.

Default: true

checkpoint.writeStatsAsStruct

delta.checkpoint.writeStatsAsStruct

Controls whether to write file statistics in the checkpoint in the struct format in the stats_parsed column and partition values as a struct in the partitionValues_parsed column

Default: undefined (Option[Boolean])

columnMapping.maxColumnId

delta.columnMapping.maxColumnId

Maximum columnId used in the schema so far for column mapping

Cannot be set

Default: 0

columnMapping.mode

delta.columnMapping.mode

DeltaColumnMappingMode to read and write parquet data files

  • none (default): A display name is the only valid identifier of a column
  • id: A column ID is the identifier of a column. This mode is used for tables converted from Iceberg; parquet files in this mode will also have corresponding field IDs for each column in their file schema.
  • name: The physical column name is the identifier of a column. Stored as part of StructField metadata in the schema. Used for reading statistics and partition values in the DeltaLog.

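A minimal sketch of switching an existing table to name mode (delta_demo is a hypothetical table name; the mode requires a reader/writer protocol upgrade):

spark.sql("""
  ALTER TABLE delta_demo SET TBLPROPERTIES (
    'delta.columnMapping.mode' = 'name',
    'delta.minReaderVersion' = '2',
    'delta.minWriterVersion' = '5')
""")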

compatibility.symlinkFormatManifest.enabled

delta.compatibility.symlinkFormatManifest.enabled

Whether to register the GenerateSymlinkManifest post-commit hook while committing a transaction or not

Default: false

dataSkippingNumIndexedCols

delta.dataSkippingNumIndexedCols

The number of columns to collect stats on for data skipping. -1 means collecting stats for all columns.

Default: 32

Must be larger than or equal to -1.

deletedFileRetentionDuration

delta.deletedFileRetentionDuration

How long to keep logically deleted data files around before deleting them physically (to prevent failures in stale readers after compactions or partition overwrites)

Default: interval 1 week

enableChangeDataFeed

delta.enableChangeDataFeed

Enables Change Data Feed

Default: false

Legacy configuration: enableChangeDataCapture

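With the property enabled, the change data feed can be read with the standard DataFrame reader options (a sketch; delta_demo is a hypothetical table name):

spark.read
  .option("readChangeFeed", "true")
  .option("startingVersion", 0)
  .table("delta_demo")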

enableDeletionVectors

delta.enableDeletionVectors

Enables Deletion Vectors

Default: false

enableExpiredLogCleanup

delta.enableExpiredLogCleanup

Controls Log Cleanup

Default: true

enableFullRetentionRollback

delta.enableFullRetentionRollback

Controls whether or not a delta table can be rolled back to any point within logRetentionDuration. When disabled, the table can be rolled back within checkpointRetentionDuration only.

Default: true

enableRowTracking

delta.enableRowTracking

Enables Row Tracking

Default: false

logRetentionDuration

delta.logRetentionDuration

How long to keep obsolete logs around before deleting them. Delta can keep logs beyond the duration until the next calendar day to avoid constantly creating checkpoints.

Default: interval 30 days (CalendarInterval)

Examples: 2 weeks, 365 days (months and years are not accepted)

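A minimal sketch of tuning the retention-related properties of an existing table (delta_demo is a hypothetical table name):

spark.sql("""
  ALTER TABLE delta_demo SET TBLPROPERTIES (
    'delta.logRetentionDuration' = 'interval 2 weeks',
    'delta.deletedFileRetentionDuration' = 'interval 10 days')
""")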

minReaderVersion

delta.minReaderVersion

The protocol reader version

Default: 1

This property is not stored as a table property in the Metadata action. It is stored as its own action. Having it modelled as a table property makes it easier to upgrade and to view the version.

minWriterVersion

delta.minWriterVersion

The protocol writer version

Default: 3

This property is not stored as a table property in the Metadata action. It is stored as its own action. Having it modelled as a table property makes it easier to upgrade and to view the version.

randomizeFilePrefixes

delta.randomizeFilePrefixes

Whether to use a random prefix in a file path instead of partition information (may be required for very high volume workloads so that S3 calls are better distributed across S3 servers)

Default: false

randomPrefixLength

delta.randomPrefixLength

The length of the random prefix in a file path for randomizeFilePrefixes

Default: 2

sampleRetentionDuration

delta.sampleRetentionDuration

How long to keep delta sample files around before deleting them

Default: interval 7 days

universalFormat.enabledFormats

delta.universalFormat.enabledFormats

A comma-separated list of table formats

Default: (empty)

Supported values:

  • hudi
  • iceberg

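A minimal sketch of enabling Universal Format with Iceberg on a new table (delta_demo is a hypothetical table name; depending on the Delta version, additional Iceberg compatibility properties may be required):

spark.sql("""
  CREATE TABLE delta_demo (id INT) USING delta
  TBLPROPERTIES (
    'delta.universalFormat.enabledFormats' = 'iceberg',
    'delta.columnMapping.mode' = 'name')
""")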

Building Configuration

buildConfig[T](
  key: String,
  defaultValue: String,
  fromString: String => T,
  validationFunction: T => Boolean,
  helpMessage: String,
  minimumProtocolVersion: Option[Protocol] = None): DeltaConfig[T]

buildConfig creates a DeltaConfig for the given key (with the delta. prefix added) and adds it to the entries internal registry.

buildConfig is used to define all of the configuration properties in a type-safe way and (as a side effect) register them with the system-wide entries internal registry.
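
As an example, this is (roughly, abridged) how the CHANGE_DATA_FEED configuration from the first snippet is defined:

val CHANGE_DATA_FEED = buildConfig[Boolean](
  "enableChangeDataFeed",    // key (registered as delta.enableChangeDataFeed)
  "false",                   // defaultValue
  _.toBoolean,               // fromString
  _ => true,                 // validationFunction
  "needs to be a boolean.")  // helpMessage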

System-Wide Configuration Entries Registry

entries: HashMap[String, DeltaConfig[_]]

DeltaConfigs utility (a Scala object) uses entries as the internal registry of DeltaConfigs by their key.

New entries are added in buildConfig.

mergeGlobalConfigs

mergeGlobalConfigs(
  sqlConfs: SQLConf,
  tableConf: Map[String, String],
  protocol: Protocol): Map[String, String]

mergeGlobalConfigs collects the default values of all the registered entries from the given SQLConf (the configuration properties with the spark.databricks.delta.properties.defaults prefix) and merges them with the given tableConf, with the table properties taking precedence over the global defaults.

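A self-contained sketch of the precedence (the maps are hypothetical, not the actual implementation):

// The global default comes from spark.databricks.delta.properties.defaults.appendOnly
val globalConfs = Map("delta.appendOnly" -> "true")
// The explicit table property
val tableConf = Map("delta.appendOnly" -> "false")
// Table properties take precedence over the global defaults
val merged = globalConfs ++ tableConf
assert(merged("delta.appendOnly") == "false")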

validateConfigurations

validateConfigurations(
  configurations: Map[String, String]): Map[String, String]

validateConfigurations...FIXME

normalizeConfigKeys

normalizeConfigKeys(
  propKeys: Seq[String]): Seq[String]

normalizeConfigKeys...FIXME