DeltaTableV2

DeltaTableV2 is a logical representation of a writable Delta table.

In Spark SQL 3's terms, DeltaTableV2 is a Table (Spark SQL) that SupportsWrite (Spark SQL).

Creating Instance

DeltaTableV2 takes the following to be created:

DeltaTableV2 is created when:

V2TableWithV1Fallback

DeltaTableV2 is a V2TableWithV1Fallback (Spark SQL).

DeltaTimeTravelSpec

DeltaTableV2 may be given a DeltaTimeTravelSpec when created.

DeltaTimeTravelSpec is assumed not to be defined by default (None).

DeltaTableV2 is given a DeltaTimeTravelSpec when:

DeltaTimeTravelSpec is used for timeTravelSpec.

Properties

properties(): Map[String, String]

properties is part of the Table (Spark SQL) abstraction.

properties requests the Snapshot for the table properties and adds the following:

Name      Value
provider  delta
location  path
comment   description (of the Metadata) if available
Type      table type of the CatalogTable if available
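
The merge described above can be sketched with plain Scala maps. This is an illustration only, not the Delta Lake implementation: PropertiesSketch and its parameters (base, path, description, tableType) are hypothetical stand-ins for the Snapshot's table properties, the table path, the Metadata description and the CatalogTable's table type.

```scala
object PropertiesSketch {
  // Hypothetical helper: combines the base table properties with the
  // extra entries listed in the table above.
  def properties(
      base: Map[String, String],
      path: String,
      description: Option[String],
      tableType: Option[String]): Map[String, String] =
    base ++
      // Always added.
      Map("provider" -> "delta", "location" -> path) ++
      // Added only when a value is available.
      description.map("comment" -> _).toList ++
      tableType.map("Type" -> _).toList
}
```

With no table type given, the resulting map has no Type entry, while provider and location are always present.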

Table Capabilities

capabilities(): Set[TableCapability]

capabilities is part of the Table (Spark SQL) abstraction.

capabilities is the following:

Creating WriteBuilder

newWriteBuilder(
  info: LogicalWriteInfo): WriteBuilder

newWriteBuilder is part of the SupportsWrite (Spark SQL) abstraction.

newWriteBuilder creates a WriteIntoDeltaBuilder (for the DeltaLog and the options from the LogicalWriteInfo).

Snapshot

snapshot: Snapshot

DeltaTableV2 has a Snapshot. In other words, DeltaTableV2 represents a Delta table at a specific version.

Scala lazy value

snapshot is a Scala lazy value and is initialized once when first accessed. Once computed it stays unchanged.
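
A minimal, self-contained sketch of this behaviour follows. The names (LazySnapshotDemo, loads) and the string value are made up for illustration; the point is only that the initializer of a lazy val runs on first access and never again.

```scala
object LazySnapshotDemo {
  // Counts how many times the initializer below has run.
  var loads: Int = 0

  // Hypothetical stand-in for an expensive snapshot load.
  // The block is evaluated on first access only; later accesses
  // return the cached value.
  lazy val snapshot: String = {
    loads += 1
    s"snapshot-load-$loads"
  }
}
```

Reading LazySnapshotDemo.snapshot twice returns the same value and leaves loads at 1.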

DeltaTableV2 uses the DeltaLog to load the Snapshot at a given version (when the optional timeTravelSpec is defined) or, otherwise, to update to the latest version.
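
The choice between a pinned version and the latest version can be sketched as follows. All types here (TimeTravel, Snap, Log) and their methods are simplified, hypothetical stand-ins rather than the Delta Lake classes, and the sketch only covers version-based time travel.

```scala
object SnapshotSelection {
  // Hypothetical, simplified stand-ins; not the Delta Lake classes.
  final case class TimeTravel(version: Long)
  final case class Snap(version: Long)

  final class Log(latest: Long) {
    // Returns the snapshot at an existing version.
    def snapshotAt(version: Long): Snap = {
      require(version >= 0 && version <= latest, s"No such version: $version")
      Snap(version)
    }
    // Returns the snapshot at the latest version.
    def update(): Snap = Snap(latest)
  }

  // A defined spec pins the version; otherwise the latest version is used.
  def select(log: Log, spec: Option[TimeTravel]): Snap =
    spec.map(s => log.snapshotAt(s.version)).getOrElse(log.update())
}
```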

snapshot is used when DeltaTableV2 is requested for the schema, partitioning and properties.

DeltaTimeTravelSpec

timeTravelSpec: Option[DeltaTimeTravelSpec]

DeltaTableV2 may have a DeltaTimeTravelSpec specified that is either given or extracted from the path (for timeTravelByPath).

timeTravelSpec throws an AnalysisException when timeTravelOpt and timeTravelByPath are both defined:

Cannot specify time travel in multiple formats.

timeTravelSpec is used when DeltaTableV2 is requested for a Snapshot and BaseRelation.
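
The rule for combining the two sources can be sketched in a few lines. Spec is a hypothetical stand-in for DeltaTimeTravelSpec, and IllegalArgumentException is used in place of Spark's AnalysisException to keep the example free of dependencies.

```scala
object TimeTravelSpecSketch {
  // Hypothetical stand-in for DeltaTimeTravelSpec.
  final case class Spec(version: Long)

  def timeTravelSpec(
      timeTravelOpt: Option[Spec],
      timeTravelByPath: Option[Spec]): Option[Spec] =
    if (timeTravelOpt.isDefined && timeTravelByPath.isDefined) {
      // Stand-in for the AnalysisException described above.
      throw new IllegalArgumentException(
        "Cannot specify time travel in multiple formats.")
    } else {
      // At most one of the two is defined at this point.
      timeTravelOpt.orElse(timeTravelByPath)
    }
}
```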

DeltaTimeTravelSpec by Path

timeTravelByPath: Option[DeltaTimeTravelSpec]

Scala lazy value

timeTravelByPath is a Scala lazy value and is initialized once when first accessed. Once computed it stays unchanged.

timeTravelByPath is undefined when CatalogTable is defined.

With no CatalogTable defined, DeltaTableV2 parses the given Path for the timeTravelByPath (using resolvePath under the covers).
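
A rough sketch of path-based parsing is given below. It assumes the time travel part is a path suffix of the form @v<version> or @<17-digit timestamp>; that convention, the regular expressions and all names here are assumptions made for illustration, and the actual resolvePath may apply further checks.

```scala
object PathTimeTravelSketch {
  // Assumed suffix conventions: "@v<version>" or "@<17-digit timestamp>".
  private val VersionSuffix = """(.*)@[vV](\d+)$""".r
  private val TimestampSuffix = """(.*)@(\d{17})$""".r

  // Hypothetical stand-ins for a time travel specification.
  sealed trait Spec
  final case class ByVersion(version: Long) extends Spec
  final case class ByTimestamp(raw: String) extends Spec

  // Returns the path without the suffix and the parsed spec, if any.
  def parse(path: String, hasCatalogTable: Boolean): (String, Option[Spec]) =
    if (hasCatalogTable) {
      // A catalog table never carries time travel in its path.
      (path, None)
    } else {
      path match {
        case VersionSuffix(base, v)    => (base, Some(ByVersion(v.toLong)))
        case TimestampSuffix(base, ts) => (base, Some(ByTimestamp(ts)))
        case _                         => (path, None)
      }
    }
}
```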

Converting to Insertable HadoopFsRelation

toBaseRelation: BaseRelation

toBaseRelation first calls verifyAndCreatePartitionFilters with the Path, the current Snapshot and the partitionFilters.

In the end, toBaseRelation requests the DeltaLog for an insertable HadoopFsRelation.

toBaseRelation is used when:

  • DeltaDataSource is requested to createRelation
  • DeltaRelation utility is used to fromV2Relation

Last update: 2021-06-01