
DeltaTableV2

DeltaTableV2 is a logical representation of a writable Delta table.

Creating Instance

DeltaTableV2 takes the following to be created:

DeltaTableV2 is created when:

Table Metadata (CatalogTable)

catalogTable: Option[CatalogTable] = None

DeltaTableV2 can be given CatalogTable (Spark SQL) when created. It is undefined by default.

catalogTable is specified when:

catalogTable is used when:

CDC Options

cdcOptions: CaseInsensitiveStringMap

DeltaTableV2 can be given cdcOptions when created. It is empty by default (and most of the time).

cdcOptions is specified when:

cdcOptions is used when:

CDF-Aware Relation

cdcRelation: Option[BaseRelation]
Lazy Value

cdcRelation is a Scala lazy value to guarantee that the code to initialize it is executed once only (when accessed for the first time) and the computed value never changes afterwards.

Learn more in the Scala Language Specification.
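
The once-only initialization can be seen in a minimal plain-Scala sketch. It is not Delta Lake code: `LazyValueDemo`, `relation` and `initializations` are hypothetical names used for illustration only.

```scala
// Minimal illustration of Scala's lazy val semantics (not Delta Lake code).
object LazyValueDemo {
  // Counts how many times the initializer below has run.
  var initializations = 0

  // Hypothetical stand-in for an expensive value such as cdcRelation.
  lazy val relation: Option[String] = {
    initializations += 1
    Some("cdc-relation")
  }

  def main(args: Array[String]): Unit = {
    relation // first access runs the initializer
    relation // second access reuses the computed value
    println(s"initializer ran $initializations time(s)")
  }
}
```

Accessing `relation` any number of times leaves the counter at 1, which is the guarantee described above.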

With CDF-aware read, cdcRelation returns a CDF-aware relation for the following:

Otherwise, cdcRelation returns None (an undefined value).


cdcRelation is used when:

Options

DeltaTableV2 can be given options (as a Map[String, String]). Options are empty by default.

The options are defined when DeltaDataSource is requested for a relation with the spark.databricks.delta.loadFileSystemConfigsFromDataFrameOptions configuration property enabled.

The options are used for the following:

DeltaLog

DeltaTableV2 creates a DeltaLog for the rootPath and the given options.

Table

DeltaTableV2 is a Table (Spark SQL).

SupportsWrite

DeltaTableV2 is a SupportsWrite (Spark SQL).

V2TableWithV1Fallback

DeltaTableV2 is a V2TableWithV1Fallback (Spark SQL).

v1Table

V2TableWithV1Fallback
v1Table: CatalogTable

v1Table is part of the V2TableWithV1Fallback (Spark SQL) abstraction.

v1Table returns the CatalogTable (with CatalogStatistics removed if DeltaTimeTravelSpec has also been specified).


v1Table expects that the (optional) CatalogTable metadata is specified or throws a DeltaIllegalStateException:

v1Table call is not expected with path based DeltaTableV2
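
The contract of v1Table can be sketched in plain Scala as follows. This is a simplified model under stated assumptions, not the Delta Lake source: `CatalogTable` and `TimeTravelSpec` are hypothetical stand-ins for the Spark SQL and Delta Lake types, and `IllegalStateException` stands in for DeltaIllegalStateException.

```scala
// Simplified, hypothetical model of the v1Table contract (not the Delta Lake source).
object V1TableSketch {
  // Hypothetical stand-ins for Spark's CatalogTable and Delta's DeltaTimeTravelSpec.
  final case class CatalogTable(name: String, stats: Option[Long])
  final case class TimeTravelSpec(version: Long)

  def v1Table(
      catalogTable: Option[CatalogTable],
      timeTravelSpec: Option[TimeTravelSpec]): CatalogTable = {
    val table = catalogTable.getOrElse {
      // Assumption: IllegalStateException stands in for DeltaIllegalStateException.
      throw new IllegalStateException(
        "v1Table call is not expected with path based DeltaTableV2")
    }
    // Drop the statistics when a time travel spec is also given.
    if (timeTravelSpec.isDefined) table.copy(stats = None) else table
  }
}
```

A path-based table (no catalog metadata) fails fast, while a catalog table with time travel comes back without statistics.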

DeltaTimeTravelSpec

DeltaTableV2 may be given a DeltaTimeTravelSpec when created.

DeltaTimeTravelSpec is undefined by default (None).

DeltaTableV2 is given a DeltaTimeTravelSpec when:

DeltaTimeTravelSpec is used for timeTravelSpec.

Properties

Table
properties(): Map[String, String]

properties is part of the Table (Spark SQL) abstraction.

properties requests the Snapshot for the table properties and adds the following:

Name      Value
provider  delta
location  path
comment   description (of the Metadata) if available
Type      table type of the CatalogTable if available
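
How the extra entries are layered on top of the snapshot's table properties can be sketched with plain Scala maps. The function and parameter names below are hypothetical. Only the keys and the "if available" behavior come from the listing above.

```scala
// Plain-Scala sketch of assembling the properties map (hypothetical names throughout).
object PropertiesSketch {
  def properties(
      snapshotProperties: Map[String, String], // table properties from the Snapshot
      path: String,                            // table location
      description: Option[String],             // description of the Metadata, if any
      tableType: Option[String]                // table type of the CatalogTable, if any
  ): Map[String, String] = {
    // provider and location are always added.
    val base = snapshotProperties ++ Map("provider" -> "delta", "location" -> path)
    // comment and Type are added only when the corresponding value is available.
    base ++ description.map("comment" -> _).toSeq ++ tableType.map("Type" -> _).toSeq
  }
}
```

For a path-based table with a description, the result contains provider, location and comment, but no Type entry.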

Table Capabilities

Table
capabilities(): Set[TableCapability]

capabilities is part of the Table (Spark SQL) abstraction.

capabilities is the following:

Creating WriteBuilder

SupportsWrite
newWriteBuilder(
info: LogicalWriteInfo): WriteBuilder

newWriteBuilder is part of the SupportsWrite (Spark SQL) abstraction.

newWriteBuilder creates a WriteIntoDeltaBuilder (for the DeltaLog and the options from the LogicalWriteInfo).

Snapshot

snapshot: Snapshot

DeltaTableV2 has a Snapshot. In other words, DeltaTableV2 represents a Delta table at a specific version.

Lazy Value

snapshot is a Scala lazy value to guarantee that the code to initialize it is executed once only (when accessed for the first time) and the computed value never changes afterwards.

Learn more in the Scala Language Specification.

DeltaTableV2 uses the DeltaLog to load the snapshot at a given version (based on the optional timeTravelSpec) or, with no timeTravelSpec defined, to update to the latest version.
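
The choice between a pinned version and the latest one can be modelled in a few lines of plain Scala. This is a sketch only: `Log` is a hypothetical stand-in for DeltaLog, its two methods merely mimic "load at a version" and "update to latest", and the time travel spec is reduced to an optional version number.

```scala
// Hypothetical model of snapshot resolution (not the Delta Lake source).
object SnapshotSketch {
  final case class Snapshot(version: Long)

  // Stand-in for DeltaLog that only knows the latest committed version.
  final class Log(val latestVersion: Long) {
    def getSnapshotAt(version: Long): Snapshot = {
      require(version >= 0 && version <= latestVersion, s"No such version: $version")
      Snapshot(version)
    }
    def update(): Snapshot = Snapshot(latestVersion)
  }

  // With a time travel version, load that version; otherwise use the latest one.
  def snapshot(log: Log, timeTravelVersion: Option[Long]): Snapshot =
    timeTravelVersion.map(log.getSnapshotAt).getOrElse(log.update())
}
```

Combined with the lazy value semantics, the version is resolved once per DeltaTableV2 instance and does not move afterwards.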


snapshot is used when:

DeltaTimeTravelSpec

timeTravelSpec: Option[DeltaTimeTravelSpec]

DeltaTableV2 may have a DeltaTimeTravelSpec specified that is either given or extracted from the path (for timeTravelByPath).

timeTravelSpec throws an AnalysisException when timeTravelOpt and timeTravelByPath are both defined:

Cannot specify time travel in multiple formats.
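
The resolution rule can be sketched in plain Scala. The types are hypothetical stand-ins, and `IllegalArgumentException` is used in place of Spark's AnalysisException to keep the example free of dependencies.

```scala
// Hypothetical model of timeTravelSpec resolution (not the Delta Lake source).
object TimeTravelSpecSketch {
  final case class TimeTravelSpec(version: Long)

  def timeTravelSpec(
      timeTravelOpt: Option[TimeTravelSpec],    // the spec given at creation time
      timeTravelByPath: Option[TimeTravelSpec]  // the spec extracted from the path
  ): Option[TimeTravelSpec] = {
    if (timeTravelOpt.isDefined && timeTravelByPath.isDefined) {
      // Assumption: stands in for AnalysisException.
      throw new IllegalArgumentException("Cannot specify time travel in multiple formats.")
    }
    // At most one of the two is defined at this point.
    timeTravelOpt.orElse(timeTravelByPath)
  }
}
```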

timeTravelSpec is used when:

DeltaTimeTravelSpec by Path

timeTravelByPath: Option[DeltaTimeTravelSpec]

Scala lazy value

timeTravelByPath is a Scala lazy value and is initialized once when first accessed. Once computed it stays unchanged.

timeTravelByPath is undefined when CatalogTable is defined.

With no CatalogTable defined, DeltaTableV2 parses the given Path for the timeTravelByPath (using resolvePath under the covers).
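
A rough sketch of path-based extraction follows. It assumes a `@v<version>` path suffix convention that is not stated on this page, and handles only version numbers. The real parsing may differ in its details (for example, timestamp suffixes or checks on whether the path exists), and all names below are hypothetical.

```scala
// Hypothetical sketch of extracting a time travel version from a path.
// Assumption: a trailing "@v<digits>" marks the version, e.g. /tmp/delta/t@v5.
object TimeTravelByPathSketch {
  private val VersionSuffix = "(.*)@[vV](\\d+)$".r

  // Splits a path into the table path and an optional version.
  def parse(path: String): (String, Option[Long]) = path match {
    case VersionSuffix(tablePath, version) => (tablePath, Some(version.toLong))
    case _                                 => (path, None)
  }

  // Undefined when a CatalogTable is defined; otherwise derived from the path.
  def timeTravelByPath(hasCatalogTable: Boolean, path: String): Option[Long] =
    if (hasCatalogTable) None else parse(path)._2
}
```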

Converting to Insertable HadoopFsRelation

toBaseRelation: BaseRelation

toBaseRelation uses verifyAndCreatePartitionFilters for the Path, the current Snapshot and partitionFilters.

In the end, toBaseRelation requests the DeltaLog for an insertable HadoopFsRelation.


toBaseRelation is used when: