
DeltaConfigs (DeltaConfigsBase)

DeltaConfigs holds the table properties that can be set on a delta table.

Configuration Properties

appendOnly

Whether a delta table is append-only (true) or not (false). When enabled, a table allows appends only and no updates or deletes.

Default: false

Used when:

autoOptimize

Whether this delta table automatically optimizes the layout of files during writes.

Default: false

checkpointInterval

How often to checkpoint the state of a delta table (at the end of transaction commit)

Default: 10
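
As an illustration of how an interval like this is typically applied, here is a minimal sketch. It assumes that a checkpoint is written whenever the just-committed version is a non-zero multiple of checkpointInterval; `CheckpointSketch` and `shouldCheckpoint` are hypothetical names, not part of Delta Lake.

```scala
object CheckpointSketch {
  // Assumption: checkpoint when the committed version is a non-zero
  // multiple of the configured interval.
  def shouldCheckpoint(committedVersion: Long, checkpointInterval: Int): Boolean =
    committedVersion != 0 && committedVersion % checkpointInterval == 0
}
```

Under that assumption and the default of 10, versions 10, 20, 30 and so on would trigger a checkpoint.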

checkpointRetentionDuration

How long to keep checkpoint files around before deleting them

Default: interval 2 days

The most recent checkpoint is never deleted. It is acceptable to keep checkpoint files beyond this duration until the next calendar day.

checkpoint.writeStatsAsJson

Controls whether to write file statistics in the checkpoint in JSON format as the stats column.

Default: true

checkpoint.writeStatsAsStruct

Controls whether to write file statistics in the checkpoint in struct format in the stats_parsed column, and partition values as a struct in the partitionValues_parsed column.

Default: undefined (Option[Boolean])

compatibility.symlinkFormatManifest.enabled

Whether to register the GenerateSymlinkManifest post-commit hook while committing a transaction or not

Default: false

dataSkippingNumIndexedCols

The number of columns to collect stats on for data skipping. -1 means collecting stats for all columns.

Default: 32
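
The meaning of -1 can be sketched as follows. The sketch assumes that the indexed columns are the leading columns in schema order; `DataSkippingSketch` and `indexedColumns` are hypothetical names used for illustration only.

```scala
object DataSkippingSketch {
  // Assumption: stats are collected for the first N columns in schema order.
  // A negative value (such as -1) selects every column.
  def indexedColumns(columns: Seq[String], numIndexedCols: Int): Seq[String] =
    if (numIndexedCols < 0) columns else columns.take(numIndexedCols)
}
```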

deletedFileRetentionDuration

How long to keep logically deleted data files around before deleting them physically (to prevent failures in stale readers after compactions or partition overwrites)

Default: interval 1 week

enableExpiredLogCleanup

Whether to clean up expired log files and checkpoints

Default: true

enableFullRetentionRollback

Controls whether or not a delta table can be rolled back to any point within logRetentionDuration. When disabled, the table can be rolled back only within checkpointRetentionDuration.

Default: true

logRetentionDuration

How long to keep obsolete logs around before deleting them. Delta can keep logs beyond the duration until the next calendar day to avoid constantly creating checkpoints.

Default: interval 30 days (CalendarInterval)

minReaderVersion

The protocol reader version

Default: 1

This property is not stored as a table property in the Metadata action. It is stored as its own action. Modelling it as a table property makes it easier to upgrade and to view the version.

minWriterVersion

The protocol writer version

Default: 3

This property is not stored as a table property in the Metadata action. It is stored as its own action. Modelling it as a table property makes it easier to upgrade and to view the version.

randomizeFilePrefixes

Whether to use a random prefix in a file path instead of partition information (may be required for very high volumes of S3 calls, so that requests are better distributed across S3 servers)

Default: false

randomPrefixLength

The length of the random prefix in a file path for randomizeFilePrefixes

Default: 2

sampleRetentionDuration

How long to keep delta sample files around before deleting them

Default: interval 7 days

Building Configuration

buildConfig[T](
  key: String,
  defaultValue: String,
  fromString: String => T,
  validationFunction: T => Boolean,
  helpMessage: String,
  minimumProtocolVersion: Option[Protocol] = None): DeltaConfig[T]

buildConfig creates a DeltaConfig for the given key (with the delta. prefix added) and adds it to the entries internal registry.

buildConfig is used to define all of the configuration properties in a type-safe way and (as a side effect) register them with the system-wide entries internal registry.
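
The build-and-register pattern can be sketched in a self-contained form. This is a simplified model, not Delta Lake's code: `SketchConfig`, `SketchConfigs`, `parse` and `fromMap` are hypothetical names, the minimumProtocolVersion parameter is left out, and the way the real registry normalizes its keys is not modelled.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for DeltaConfig[T]; not the real class.
final case class SketchConfig[T](
    key: String,
    defaultValue: String,
    fromString: String => T,
    validationFunction: T => Boolean,
    helpMessage: String) {

  // Parses a raw string and rejects values that fail validation.
  def parse(raw: String): T = {
    val value = fromString(raw)
    require(validationFunction(value), s"$key $helpMessage")
    value
  }

  // Reads the property from a map of table properties, using the default if absent.
  def fromMap(props: Map[String, String]): T =
    parse(props.getOrElse(key, defaultValue))
}

object SketchConfigs {
  // Registry of every defined property, keyed here by the unprefixed name.
  val entries = mutable.HashMap.empty[String, SketchConfig[_]]

  def buildConfig[T](
      key: String,
      defaultValue: String,
      fromString: String => T,
      validationFunction: T => Boolean,
      helpMessage: String): SketchConfig[T] = {
    val config =
      SketchConfig(s"delta.$key", defaultValue, fromString, validationFunction, helpMessage)
    entries.put(key, config) // side effect: register the new property
    config
  }

  // Example definition modelled on checkpointInterval (default 10).
  val CheckpointInterval: SketchConfig[Int] =
    buildConfig[Int]("checkpointInterval", "10", _.toInt, _ > 0, "needs to be a positive integer.")
}
```

Because every property goes through buildConfig, defining a property and registering it cannot drift apart: a property that exists as a typed value is always discoverable through the registry.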

System-Wide Configuration Entries Registry

entries: HashMap[String, DeltaConfig[_]]

The DeltaConfigs utility (a Scala object) uses the entries internal registry to keep track of DeltaConfig instances by their key.

New entries are added in buildConfig.

entries is used when:

mergeGlobalConfigs Utility

mergeGlobalConfigs(
  sqlConfs: SQLConf,
  tableConf: Map[String, String],
  protocol: Protocol): Map[String, String]

mergeGlobalConfigs finds all spark.databricks.delta.properties.defaults-prefixed configuration properties among the entries.
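
A rough model of this lookup, using plain maps in place of SQLConf, is shown below. It assumes that explicit table properties take precedence over session-wide defaults, and it ignores the protocol argument. `MergeSketch`, `sessionConf` and `knownKeys` are hypothetical names; `knownKeys` stands in for the unprefixed keys held in entries.

```scala
object MergeSketch {
  val GlobalPrefix = "spark.databricks.delta.properties.defaults."

  // sessionConf models SQLConf as a plain Map (an assumption of this sketch).
  def mergeGlobalConfigs(
      sessionConf: Map[String, String],
      tableConf: Map[String, String],
      knownKeys: Set[String]): Map[String, String] = {
    // For every known property, pick up a session-wide default if one is set.
    val globalDefaults = knownKeys.flatMap { key =>
      sessionConf.get(GlobalPrefix + key).map(value => s"delta.$key" -> value)
    }.toMap
    // Assumption: table-level properties override session-wide defaults.
    globalDefaults ++ tableConf
  }
}
```

Session properties that do not carry the prefix, or that name an unknown property, are simply ignored by this sketch.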

mergeGlobalConfigs is used when:

validateConfigurations Utility

validateConfigurations(
  configurations: Map[String, String]): Map[String, String]

validateConfigurations...FIXME

validateConfigurations is used when:

normalizeConfigKeys Utility

normalizeConfigKeys(
  propKeys: Seq[String]): Seq[String]

normalizeConfigKeys...FIXME

normalizeConfigKeys is used when:

spark.databricks.delta.properties.defaults Prefix

DeltaConfigs uses the spark.databricks.delta.properties.defaults prefix for global configuration properties.
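
To make the naming concrete, the following sketch derives the global key for a table property by swapping the delta. prefix for the global one. `GlobalDefaultKey` and `forTableProperty` are hypothetical helpers written for illustration.

```scala
object GlobalDefaultKey {
  val Prefix = "spark.databricks.delta.properties.defaults"

  // Replaces the table-level "delta." prefix with the global defaults prefix.
  def forTableProperty(tableKey: String): String =
    s"$Prefix.${tableKey.stripPrefix("delta.")}"
}
```

For example, the table property delta.appendOnly corresponds to the global property spark.databricks.delta.properties.defaults.appendOnly.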


Last update: 2021-06-12