
DeltaOptions

DeltaOptions is an extension of DeltaWriteOptions and DeltaReadOptions for all supported options of the DeltaDataSource.

DeltaOptions is used to create the WriteIntoDelta command, DeltaSink, and DeltaSource.

The options are verified when a DeltaOptions instance is created.

Options

checkpointLocation

dataChange

excludeRegex

ignoreChanges

ignoreDeletes

ignoreFileDeletion

maxBytesPerTrigger

maxFilesPerTrigger

The maximum number of files (AddFiles) that DeltaSource should scan (read) in a streaming micro-batch (trigger)

Default: 1000

Must be at least 1
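A minimal sketch of setting this option on a streaming read (the SparkSession `spark` and the table path are assumed for illustration):

```scala
// Sketch: limit every micro-batch to at most 100 files of the Delta table
val events = spark
  .readStream
  .format("delta")
  .option("maxFilesPerTrigger", "100")
  .load("/tmp/delta/events")
```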

mergeSchema

Enables schema migration (and allows automatic schema merging during a write operation for WriteIntoDelta and DeltaSink)

Equivalent SQL Session configuration: spark.databricks.delta.schema.autoMerge.enabled
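A sketch of enabling schema migration for a single write, or session-wide via the equivalent configuration property (the DataFrame `df` and the table path are assumptions):

```scala
// Per-write: allow automatic schema merging for this write only
df.write
  .format("delta")
  .option("mergeSchema", "true")
  .mode("append")
  .save("/tmp/delta/events")

// Session-wide equivalent
spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")
```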

optimizeWrite

overwriteSchema

path

(required) Directory on a Hadoop DFS-compliant file system with an optional time travel identifier

Default: (undefined)

Note

Can also be specified using the load method of DataFrameReader and DataStreamReader.
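The two ways of giving the path are sketched below (the table path is hypothetical):

```scala
// As an explicit option...
val df1 = spark.read.format("delta").option("path", "/tmp/delta/events").load()

// ...or, equivalently, as the argument of load
val df2 = spark.read.format("delta").load("/tmp/delta/events")
```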

queryName

replaceWhere

timestampAsOf

Timestamp of the version of a Delta table for Time Travel

Mutually exclusive with versionAsOf option and the time travel identifier of the path option.

userMetadata

Defines user-defined commit metadata

Takes precedence over the spark.databricks.delta.commitInfo.userMetadata configuration property

Available by inspecting CommitInfos using DESCRIBE HISTORY or DeltaTable.history.
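A sketch of attaching commit metadata to a single write and reviewing it afterwards (the metadata string and table path are made up for illustration):

```scala
// Record who/what produced this commit in the table history
df.write
  .format("delta")
  .option("userMetadata", "nightly-backfill")
  .mode("append")
  .save("/tmp/delta/events")

// Inspect the CommitInfos later, e.g.
spark.sql("DESCRIBE HISTORY delta.`/tmp/delta/events`").show()
```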

versionAsOf

Version of a Delta table for Time Travel

Mutually exclusive with timestampAsOf option and the time travel identifier of the path option.

Used when:

  • DeltaDataSource is requested for a relation
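The two Time Travel options can be sketched as follows; they are mutually exclusive, so a query uses one or the other (table path and values are assumptions):

```scala
// Read the table as of a specific version...
val atVersion = spark.read
  .format("delta")
  .option("versionAsOf", "1")
  .load("/tmp/delta/events")

// ...or as of a timestamp (never both)
val atTime = spark.read
  .format("delta")
  .option("timestampAsOf", "2020-12-01 12:00:00")
  .load("/tmp/delta/events")
```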

Creating Instance

DeltaOptions takes the following to be created:

  • Case-Insensitive Options
  • SQLConf (Spark SQL)

When created, DeltaOptions verifies the input options.
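A sketch of constructing a DeltaOptions directly from the two inputs above; this is an internal API, so the snippet assumes access to the org.apache.spark.sql.delta package and a SparkSession `spark`:

```scala
import org.apache.spark.sql.catalyst.util.CaseInsensitiveMap
import org.apache.spark.sql.delta.DeltaOptions

// Verification of the options happens at construction time
val deltaOptions = new DeltaOptions(
  CaseInsensitiveMap(Map("mergeSchema" -> "true")),
  spark.sessionState.conf)
```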

DeltaOptions is created for the WriteIntoDelta command, DeltaSink, and DeltaSource.

How to Define Options

The options can be defined using the option method of the following:

  • DataFrameReader and DataFrameWriter for batch queries (Spark SQL)
  • DataStreamReader and DataStreamWriter for streaming queries (Spark Structured Streaming)
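A sketch of the four entry points (paths and option values are illustrative only):

```scala
// Batch read and write
val df = spark.read.format("delta")
  .option("versionAsOf", "1")
  .load("/tmp/delta/events")
df.write.format("delta")
  .option("mergeSchema", "true")
  .mode("append")
  .save("/tmp/delta/events_copy")

// Streaming read and write
val stream = spark.readStream.format("delta")
  .option("maxFilesPerTrigger", "100")
  .load("/tmp/delta/events")
stream.writeStream.format("delta")
  .option("checkpointLocation", "/tmp/delta/_checkpoints")
  .start("/tmp/delta/events_copy")
```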

Verifying Options

verifyOptions(
  options: CaseInsensitiveMap[String]): Unit

verifyOptions finds invalid options among the input options.

Note

In the open-source version, verifyOptions effectively does nothing, since the underlying objects (recordDeltaEvent and the others) are no-ops.

verifyOptions is used when DeltaOptions is created.


Last update: 2020-12-11