DeltaOptions

DeltaOptions (aka DeltaWriteOptionsImpl, DeltaWriteOptions) is the set of options for the Delta data source.

The options can be defined using the option method of DataFrameReader, DataFrameWriter, DataStreamReader, and DataStreamWriter.
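The four entry points can be sketched as follows. This is only an illustration, assuming the Delta Lake library is on the classpath; the session settings, the temporary directory and the option values are hypothetical.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Assumption: Delta Lake is on the classpath; master, app name and paths are illustrative only
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("delta-options-demo")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

val root = Files.createTempDirectory("delta-options")
val tablePath = root.resolve("table").toString
val checkpointPath = root.resolve("checkpoint").toString

// DataFrameWriter.option
spark.range(5).write
  .format("delta")
  .option("userMetadata", "initial load")
  .save(tablePath)

// DataFrameReader.option
val batch = spark.read
  .format("delta")
  .option("versionAsOf", "0")
  .load(tablePath)

// DataStreamReader.option
val stream = spark.readStream
  .format("delta")
  .option("maxFilesPerTrigger", "10")
  .load(tablePath)

// DataStreamWriter.option (the streaming query is only defined here, not started)
val sink = stream.writeStream
  .format("delta")
  .option("checkpointLocation", checkpointPath)
  .option("mergeSchema", "true")
```

Option values are passed as strings here; the option method also has overloads for Boolean, Long and Double values.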

DeltaOptions is used to create WriteIntoDelta command, DeltaSink, and DeltaSource.

DeltaOptions can be verified using the verifyOptions utility.

Options

checkpointLocation

dataChange

excludeRegex

ignoreChanges

ignoreDeletes

ignoreFileDeletion

maxBytesPerTrigger

maxFilesPerTrigger

Maximum number of files (AddFiles) that DeltaSource will scan (read) in a streaming micro-batch (trigger)

Default: 1000

Must be at least 1
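The default and the lower bound can be illustrated with a small stand-alone sketch. The helper below (resolveMaxFilesPerTrigger) is hypothetical and is not Delta Lake's own code; it only mirrors the two rules stated above and, unlike a CaseInsensitiveMap, looks the key up case-sensitively.

```scala
import scala.util.Try

// Hypothetical helper (not part of Delta Lake): resolves maxFilesPerTrigger
// from user-provided options using the documented default (1000) and lower bound (1)
def resolveMaxFilesPerTrigger(options: Map[String, String]): Int = {
  val default = 1000
  options.get("maxFilesPerTrigger") match {
    case None => default
    case Some(raw) =>
      // Accept only integers that are at least 1
      Try(raw.trim.toInt).toOption.filter(_ >= 1).getOrElse {
        throw new IllegalArgumentException(
          s"Invalid value '$raw' for option 'maxFilesPerTrigger', must be at least 1")
      }
  }
}
```

With this sketch, an empty options map resolves to 1000, "5" resolves to 5, and "0" or a non-numeric value is rejected with an IllegalArgumentException.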

mergeSchema

Enables schema migration, i.e. automatic schema merging during a write operation (for WriteIntoDelta and DeltaSink)

Equivalent SQL Session configuration: spark.databricks.delta.schema.autoMerge.enabled
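A minimal sketch of a write that relies on mergeSchema, assuming the Delta Lake library is on the classpath. The table location and the extra column name (source) are made up for the example.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

// Assumption: Delta Lake is on the classpath; session settings are illustrative only
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("delta-merge-schema-demo")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

val tablePath = Files.createTempDirectory("delta-merge-schema").resolve("table").toString

// First write creates the table with a single column (id)
spark.range(3).write.format("delta").save(tablePath)

// Second write appends rows with an extra column; mergeSchema lets the table schema evolve
spark.range(3, 6)
  .withColumn("source", lit("second-batch"))
  .write
  .format("delta")
  .mode("append")
  .option("mergeSchema", "true")
  .save(tablePath)

val columns = spark.read.format("delta").load(tablePath).columns.toSet
```

Setting spark.databricks.delta.schema.autoMerge.enabled to true in the session configuration should have the same effect for all writes in that session.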

optimizeWrite

overwriteSchema

path

(required) Directory on a Hadoop DFS-compliant file system with an optional time travel identifier.

Default: (undefined)

Can also be specified using the load method of DataFrameReader and DataStreamReader.

queryName

replaceWhere

timestampAsOf

Time travel to the state of a table as of a timestamp

Mutually exclusive with the versionAsOf option and the time travel identifier of the path option.

userMetadata

Available by inspecting CommitInfos using DESCRIBE HISTORY or DeltaTable.history.
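As a sketch of how the value could be written and read back, assuming the Delta Lake library is on the classpath (the table location and the metadata text are hypothetical):

```scala
import java.nio.file.Files
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Assumption: Delta Lake is on the classpath; session settings are illustrative only
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("delta-user-metadata-demo")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

val tablePath = Files.createTempDirectory("delta-user-metadata").resolve("table").toString

// Attach free-form metadata to the commit created by this write
spark.range(3).write
  .format("delta")
  .option("userMetadata", "nightly-backfill")
  .save(tablePath)

// DeltaTable.history: the most recent commit only
val lastCommit = DeltaTable.forPath(spark, tablePath).history(1)
val userMetadata = lastCommit.select("userMetadata").head().getString(0)

// The same information through SQL
val historyViaSql = spark.sql(s"DESCRIBE HISTORY delta.`$tablePath`")
```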

versionAsOf

Time travel to a version of a table

Mutually exclusive with the timestampAsOf option and the time travel identifier of the path option.

Used exclusively when DeltaDataSource is requested for a relation (as a RelationProvider)
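A short sketch of time travel in a batch read, assuming the Delta Lake library is on the classpath. The table location is hypothetical, and only one of the time travel options is used per read.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Assumption: Delta Lake is on the classpath; session settings are illustrative only
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("delta-time-travel-demo")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

val tablePath = Files.createTempDirectory("delta-time-travel").resolve("table").toString

// Two commits: version 0 (3 rows) and version 1 (3 more rows)
spark.range(3).write.format("delta").save(tablePath)
spark.range(3, 6).write.format("delta").mode("append").save(tablePath)

// Read the table as of version 0
val v0 = spark.read
  .format("delta")
  .option("versionAsOf", "0")
  .load(tablePath)

// Read the latest version (no time travel option)
val latest = spark.read.format("delta").load(tablePath)

// timestampAsOf is used the same way, with a timestamp instead of a version, e.g.
//   .option("timestampAsOf", "2024-01-01 00:00:00")
// It is not combined with versionAsOf, since the two options are mutually exclusive
```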

Creating Instance

DeltaOptions takes the following to be created:

  • Options (Map[String, String] or CaseInsensitiveMap[String])

  • SQLConf
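A hedged sketch of creating an instance directly. DeltaOptions is an internal class, so the constructor shape and the canMergeSchema member used below are assumptions that may differ between Delta Lake versions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.delta.DeltaOptions

// Assumption: Delta Lake is on the classpath; session settings are illustrative only
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("delta-options-instance-demo")
  .getOrCreate()

// Options as a plain Map[String, String] together with the session's SQLConf
val options = new DeltaOptions(
  Map("mergeSchema" -> "true", "maxFilesPerTrigger" -> "10"),
  spark.sessionState.conf)

// Assumed accessor: reflects the mergeSchema option
val mergeSchemaEnabled = options.canMergeSchema
```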

DeltaOptions is created when:

verifyOptions Utility

verifyOptions(
  options: CaseInsensitiveMap[String]): Unit

verifyOptions…FIXME

verifyOptions is used when: