Options

checkpointLocation

Checkpoint directory for streaming queries (Spark Structured Streaming).
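A minimal sketch of setting checkpointLocation on a streaming query over a delta table (the paths are hypothetical):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Read a delta table as a stream and write it out to another delta table,
// tracking progress in a checkpoint directory (all paths are hypothetical).
spark.readStream
  .format("delta")
  .load("/tmp/delta/events")
  .writeStream
  .format("delta")
  .option("checkpointLocation", "/tmp/delta/events_checkpoint")
  .start("/tmp/delta/events_copy")
```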

dataChange

Whether to write new data to the table or merely rearrange data that is already part of it. Setting this option to false declares that the data written by this job does not change any data in the table and only rearranges existing data. This makes sure that streaming queries reading from the table will not see any new changes.


Demo

Learn more in Demo: dataChange.
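The typical use case is file compaction: rewriting existing data into fewer files without signalling a data change. A sketch, assuming an existing delta table at a hypothetical path (the repartition factor is also illustrative):

```scala
// Rearrange existing data (e.g. compact small files) without marking it
// as a data change, so downstream streaming readers see nothing new.
spark.read
  .format("delta")
  .load("/tmp/delta/events")
  .repartition(4)
  .write
  .format("delta")
  .mode("overwrite")
  .option("dataChange", "false")
  .save("/tmp/delta/events")
```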

excludeRegex

scala.util.matching.Regex to filter out the paths of FileActions

Default: (undefined)

Use DeltaOptions.excludeRegex to access the value


failOnDataLoss

Controls whether or not to fail loading a delta table when the earliest available version (in the _delta_log directory) is after the version requested

Default: true

Use DeltaOptions.failOnDataLoss to access the value

ignoreChanges

Controls whether a streaming query should tolerate files rewritten in the source table (e.g. due to UPDATE, MERGE INTO, DELETE or OVERWRITE). Rewritten files may be re-processed, so downstream consumers can observe duplicate records.

ignoreDeletes

Controls whether a streaming query should ignore transactions that delete data (e.g. at partition boundaries), so the query does not fail when data is deleted from the source table

ignoreFileDeletion

A legacy (deprecated) option superseded by ignoreDeletes

maxBytesPerTrigger

Soft maximum of bytes to be processed in a streaming micro-batch (trigger). At least one file is always processed, so a single large file can push a micro-batch over the limit.

maxFilesPerTrigger

Maximum number of files (AddFiles) that DeltaSource is supposed to scan (read) in a streaming micro-batch (trigger)

Default: 1000

Must be at least 1
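A sketch of capping the number of files per micro-batch on a streaming read (the path and the limit are hypothetical):

```scala
// Scan at most 100 AddFiles per micro-batch (trigger).
spark.readStream
  .format("delta")
  .option("maxFilesPerTrigger", 100)
  .load("/tmp/delta/events")
```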

maxRecordsPerFile

Maximum number of records per data file

Spark SQL

maxRecordsPerFile is one of the FileFormatWriter (Spark SQL) options, so all Delta Lake does is hand it over to the underlying writing infrastructure.


mergeSchema

Enables schema migration (and allows automatic schema merging during a write operation for WriteIntoDelta and DeltaSink)

Equivalent SQL Session configuration: spark.databricks.delta.schema.autoMerge.enabled
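A sketch of an append that merges new columns into the table schema, assuming df is a DataFrame whose schema is a compatible superset of the table's (path hypothetical):

```scala
// Append data whose schema adds new columns; mergeSchema evolves the
// table schema instead of failing the write.
df.write
  .format("delta")
  .mode("append")
  .option("mergeSchema", "true")
  .save("/tmp/delta/events")
```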

optimizeWrite

optimizeWrite is a writer option. It is currently not used.

overwriteSchema

Enables overwriting the schema or changing the partitioning of a delta table during an overwrite write operation

Use DeltaOptions.canOverwriteSchema to access the value

Note

The schema cannot be overwritten when using the replaceWhere option.
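A sketch of replacing a table's contents and schema in one overwrite, assuming df is a DataFrame with the new schema (path hypothetical):

```scala
// Replace both the data and the schema of the target table.
df.write
  .format("delta")
  .mode("overwrite")
  .option("overwriteSchema", "true")
  .save("/tmp/delta/events")
```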

partitionOverwriteMode

Controls whether an overwrite of a partitioned delta table replaces all partitions (static) or only the partitions with data being written (dynamic)

Mutually exclusive with replaceWhere


path

(required) Directory on a Hadoop DFS-compliant file system with an optional time travel identifier

Default: (undefined)

Note

Can also be specified using the load method of DataFrameReader and DataStreamReader.
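A sketch of specifying the path via load, with and without a time travel identifier appended after @ (the path and version are hypothetical):

```scala
// Load the table at its current state...
spark.read.format("delta").load("/tmp/delta/events")

// ...or at a given version using the time travel identifier in the path.
spark.read.format("delta").load("/tmp/delta/events@v2")
```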

queryName

readChangeFeed

Enables Change Data Feed while reading delta tables (CDC-aware table scans)

Use DeltaOptions.readChangeFeed for the value

Note

Use the startingVersion, startingTimestamp, endingVersion and endingTimestamp options to fine-tune Change Data Feed-aware queries.
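A sketch of a CDC-aware batch read, assuming Change Data Feed is enabled on the table (the path and starting version are hypothetical):

```scala
// Read row-level changes (inserts, updates, deletes) since version 1.
spark.read
  .format("delta")
  .option("readChangeFeed", "true")
  .option("startingVersion", 1)
  .load("/tmp/delta/events")
```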

replaceWhere

Partition predicates to overwrite only the data that matches predicates over partition columns (unless replaceWhere.dataColumns.enabled is enabled)

Available as DeltaWriteOptions.replaceWhere

Mutually exclusive with partitionOverwriteMode

Demo

Learn more in Demo: replaceWhere.
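A sketch of overwriting only one partition's worth of data, assuming the table is partitioned by a date column (the path, column and predicate are hypothetical):

```scala
// Overwrite only the rows matching the partition predicate;
// data outside the predicate is left untouched.
df.write
  .format("delta")
  .mode("overwrite")
  .option("replaceWhere", "date >= '2024-01-01' AND date < '2024-02-01'")
  .save("/tmp/delta/events")
```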

streamingSourceTrackingId

The directory for a schema log of DeltaSourceMetadataTrackingLog

Available as DeltaOptions.sourceTrackingId


timestampAsOf

Timestamp of the version of a delta table for Time Travel

Mutually exclusive with versionAsOf option and the time travel identifier of the path option.


userMetadata

Defines user-defined commit metadata

Takes precedence over spark.databricks.delta.commitInfo.userMetadata

Available by inspecting CommitInfos using DESCRIBE HISTORY or DeltaTable.history.
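A sketch of attaching custom metadata to a commit (the path and metadata string are hypothetical):

```scala
// Record who/what produced this commit; visible later via DESCRIBE HISTORY.
df.write
  .format("delta")
  .mode("append")
  .option("userMetadata", "nightly-backfill-run-42")
  .save("/tmp/delta/events")
```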

versionAsOf

Version of a delta table for Time Travel

Must be castable to a long

Mutually exclusive with timestampAsOf option and the time travel identifier of the path option.
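A sketch of both time travel options; only one may be set per read (the path, version and timestamp are hypothetical):

```scala
// Read the table as of version 2...
spark.read
  .format("delta")
  .option("versionAsOf", 2)
  .load("/tmp/delta/events")

// ...or as of a point in time (timestampAsOf instead of versionAsOf).
spark.read
  .format("delta")
  .option("timestampAsOf", "2024-01-01 00:00:00")
  .load("/tmp/delta/events")
```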
