
Options

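Delta Lake comes with options to fine-tune how it is used. They can be defined using the option method of Spark's DataFrameReader and DataFrameWriter (batch queries) and DataStreamReader and DataStreamWriter (streaming queries).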

checkpointLocation

Checkpoint directory for storing checkpoint data of streaming queries (Spark Structured Streaming).
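A minimal sketch of how checkpointLocation might be set when starting a streaming write to a Delta table. The function name and both paths are hypothetical; stream_df is assumed to be a streaming DataFrame.

```python
def start_delta_sink(stream_df, table_path, checkpoint_dir):
    """Start a streaming write to a Delta table with an explicit checkpoint directory."""
    return (
        stream_df.writeStream
        .format("delta")
        # Directory where Structured Streaming stores checkpoint data for this query
        .option("checkpointLocation", checkpoint_dir)
        .start(table_path)
    )
```

With a real streaming DataFrame this could be called as start_delta_sink(events, "/tmp/delta/events", "/tmp/checkpoints/events"), where both paths are placeholders.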

dataChange

Whether the write adds new data to the table (true) or only rearranges data that is already part of the table (false). Setting this option to false declares that the data written by the job does not change any data in the table and merely rearranges existing data. This ensures that streaming queries reading from the table do not see the rewritten files as new changes.

Used when:

Demo

Learn more in Demo: dataChange.
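A typical use of dataChange is file compaction: the table is read, repartitioned, and written back over itself with dataChange disabled. The following is only a sketch under that assumption; compact_table and its parameters are hypothetical names, and spark is assumed to be a SparkSession with Delta Lake available.

```python
def compact_table(spark, table_path, num_files):
    """Rewrite a Delta table into num_files files without changing its rows."""
    (
        spark.read.format("delta").load(table_path)
        .repartition(num_files)
        .write.format("delta")
        .mode("overwrite")
        # Declare that this commit only rearranges existing data
        .option("dataChange", "false")
        .save(table_path)
    )
```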

excludeRegex

ignoreChanges

ignoreDeletes

ignoreFileDeletion

maxBytesPerTrigger

maxFilesPerTrigger

Maximum number of files (AddFiles) that DeltaSource scans (reads) in a single streaming micro-batch (trigger).

Default: 1000

Must be at least 1
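As an illustration, maxFilesPerTrigger could be passed when opening a streaming read of a Delta table. The helper below is a hypothetical sketch; the up-front check simply mirrors the "at least 1" rule on the caller's side and is not Delta Lake's own validation.

```python
def read_delta_stream(spark, table_path, max_files_per_trigger=1000):
    """Open a streaming read of a Delta table, capping files per micro-batch."""
    if max_files_per_trigger < 1:
        # Client-side guard that mirrors the documented lower bound
        raise ValueError("maxFilesPerTrigger must be at least 1")
    return (
        spark.readStream
        .format("delta")
        .option("maxFilesPerTrigger", max_files_per_trigger)
        .load(table_path)
    )
```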

mergeSchema

Enables schema migration (and allows automatic schema merging during a write operation for WriteIntoDelta and DeltaSink)

Equivalent SQL Session configuration: spark.databricks.delta.schema.autoMerge.enabled
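A sketch of a batch append that opts in to schema merging for a single write. append_with_schema_merge is a hypothetical helper, and df is assumed to be a DataFrame whose schema may contain columns the table does not have yet.

```python
def append_with_schema_merge(df, table_path):
    """Append df to a Delta table, allowing new columns to be merged into its schema."""
    (
        df.write.format("delta")
        .mode("append")
        # Per-write opt-in; the session configuration above applies session-wide instead
        .option("mergeSchema", "true")
        .save(table_path)
    )
```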

optimizeWrite

Enables...FIXME

overwriteSchema

path

(required) Directory on a Hadoop DFS-compliant file system with an optional time travel identifier

Default: (undefined)

Note

Can also be specified using load method of DataFrameReader and DataStreamReader.
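The two ways of supplying the path can be sketched side by side. Both helper names are hypothetical, and spark is assumed to be a SparkSession with Delta Lake available.

```python
def load_with_path_option(spark, table_path):
    """Pass the table directory through the path option."""
    return spark.read.format("delta").option("path", table_path).load()


def load_with_load_argument(spark, table_path):
    """Pass the table directory directly to load."""
    return spark.read.format("delta").load(table_path)
```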

queryName

replaceWhere

Available as DeltaWriteOptions.replaceWhere

Demo

Learn more in Demo: replaceWhere.
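replaceWhere is commonly combined with overwrite mode so that only the rows matching a predicate are replaced. The sketch below assumes that usage; overwrite_matching and the example predicate are hypothetical.

```python
def overwrite_matching(df, table_path, predicate):
    """Overwrite only the part of a Delta table that matches predicate."""
    (
        df.write.format("delta")
        .mode("overwrite")
        # SQL predicate selecting the data to replace
        .option("replaceWhere", predicate)
        .save(table_path)
    )
```

A call might look like overwrite_matching(january, "/tmp/delta/events", "date >= '2021-01-01' AND date < '2021-02-01'"), assuming the table has a date column.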

timestampAsOf

Timestamp of the version of a Delta table for Time Travel

Mutually exclusive with versionAsOf option and the time travel identifier of the path option.

userMetadata

Defines user-defined commit metadata

Takes precedence over the spark.databricks.delta.commitInfo.userMetadata configuration property

Available by inspecting CommitInfos using DESCRIBE HISTORY or DeltaTable.history.
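A sketch of attaching a free-form note to a single commit. write_with_note is a hypothetical helper and the note text is arbitrary.

```python
def write_with_note(df, table_path, note):
    """Append df to a Delta table and record note as the commit's user metadata."""
    (
        df.write.format("delta")
        .mode("append")
        # Stored in the CommitInfo of the commit this write produces
        .option("userMetadata", note)
        .save(table_path)
    )
```

After such a write, the note should appear in the userMetadata column of DESCRIBE HISTORY for the table.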

versionAsOf

Version of a Delta table for Time Travel

Mutually exclusive with timestampAsOf option and the time travel identifier of the path option.

Used when:

  • DeltaDataSource is requested for a relation
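Because versionAsOf and timestampAsOf are mutually exclusive, a caller may want to check that exactly one of them is given before building the reader. The helper below is a hypothetical client-side sketch, not how DeltaDataSource itself validates the options.

```python
def read_as_of(spark, table_path, version=None, timestamp=None):
    """Time-travel read of a Delta table by exactly one of version or timestamp."""
    if (version is None) == (timestamp is None):
        # Either both or neither were given
        raise ValueError("Specify exactly one of version or timestamp")
    reader = spark.read.format("delta")
    if version is not None:
        reader = reader.option("versionAsOf", version)
    else:
        reader = reader.option("timestampAsOf", timestamp)
    return reader.load(table_path)
```

For example, read_as_of(spark, "/tmp/delta/events", version=3) would request version 3, while passing both version and timestamp raises ValueError. The path and version number are placeholders.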

Last update: 2021-04-05