DataSourceV2Relation Leaf Logical Operator

DataSourceV2Relation is a leaf logical operator that represents a scan over tables with support for BATCH_READ (at the very least).

DataSourceV2Relation is an ExposesMetadataColumns and can add extra metadata columns to the output columns.

Creating Instance

DataSourceV2Relation takes the following to be created:

DataSourceV2Relation is created (indirectly) using the create utility (and withMetadataColumns).

CatalogPlugin

DataSourceV2Relation can be given a CatalogPlugin when created.

The CatalogPlugin can be as follows:

Creating DataSourceV2Relation

create(
  table: Table,
  catalog: Option[CatalogPlugin],
  identifier: Option[Identifier]): DataSourceV2Relation
create(
  table: Table,
  catalog: Option[CatalogPlugin],
  identifier: Option[Identifier],
  options: CaseInsensitiveStringMap): DataSourceV2Relation

create replaces CharType and VarcharType types in the schema of the given Table with "annotated" StringType (as the query engine doesn't support char/varchar).

In the end, create uses the new schema to create a DataSourceV2Relation.
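
The type replacement described above can be pictured with a small, self-contained sketch. It does not use Spark's classes: DataType, Field and the OriginalTypeKey metadata key are simplified stand-ins invented for illustration, and nested types are ignored.

```scala
// A minimal sketch of "replace char/varchar with annotated string".
// All types here are simplified stand-ins, not org.apache.spark.sql.types.
object CharVarcharSketch {
  sealed trait DataType
  case object StringType extends DataType
  case object IntegerType extends DataType
  final case class CharType(length: Int) extends DataType
  final case class VarcharType(length: Int) extends DataType

  // A field carries free-form metadata, which is where the annotation lives.
  final case class Field(
      name: String,
      dataType: DataType,
      metadata: Map[String, String] = Map.empty)

  // Hypothetical metadata key, used only in this sketch.
  val OriginalTypeKey = "originalType"

  // Rewrites char/varchar fields to StringType and records the original type
  // in the field metadata; all other fields are returned unchanged.
  def replaceCharVarcharWithString(schema: Seq[Field]): Seq[Field] =
    schema.map { field =>
      field.dataType match {
        case CharType(n) =>
          field.copy(
            dataType = StringType,
            metadata = field.metadata + (OriginalTypeKey -> s"char($n)"))
        case VarcharType(n) =>
          field.copy(
            dataType = StringType,
            metadata = field.metadata + (OriginalTypeKey -> s"varchar($n)"))
        case _ => field
      }
    }
}
```

Keeping the original type as an annotation lets later phases recover the declared length even though the field is handled as a plain string.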


create is used when:

Metadata Columns

LogicalPlan
metadataOutput: Seq[AttributeReference]

metadataOutput is part of the LogicalPlan abstraction.

metadataOutput checks whether this Table is a SupportsMetadataColumns. If so, metadataOutput requests this Table for metadata columns.

Otherwise, metadataOutput returns no metadata columns (Nil).
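
The two cases above amount to a type test on the table. The following is a minimal sketch of that dispatch, assuming simplified stand-in traits (Table, SupportsMetadataColumns and MetadataColumn are modelled here and are not Spark's connector interfaces); the table and column names are made up.

```scala
// A minimal sketch of the metadataOutput dispatch using stand-in types.
object MetadataOutputSketch {
  final case class MetadataColumn(name: String, dataType: String)

  trait Table {
    def name: String
  }

  // Mix-in for tables that can expose metadata columns.
  trait SupportsMetadataColumns extends Table {
    def metadataColumns: Seq[MetadataColumn]
  }

  // Requests metadata columns only from tables that support them.
  def metadataOutput(table: Table): Seq[MetadataColumn] = table match {
    case hasMetadata: SupportsMetadataColumns => hasMetadata.metadataColumns
    case _                                    => Nil
  }

  // Hypothetical tables used for the demonstration.
  object PlainTable extends Table {
    val name = "plain"
  }

  object FileTable extends SupportsMetadataColumns {
    val name = "files"
    val metadataColumns: Seq[MetadataColumn] =
      Seq(MetadataColumn("_file", "string"))
  }
}
```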

Lazy Value

metadataOutput is a Scala lazy value to guarantee that the code to initialize it is executed once only (when accessed for the first time) and the computed value never changes afterwards.

Learn more in the Scala Language Specification.
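
The once-only initialization can be observed directly in plain Scala. This sketch uses a hypothetical Relation class with a counter; the column names are made up.

```scala
// A minimal sketch of Scala lazy value semantics.
object LazyValSketch {
  // Hypothetical stand-in for a relation; only the lazy value matters here.
  class Relation {
    // Counts how many times the initializer below actually runs.
    var initCount: Int = 0

    lazy val metadataOutput: Seq[String] = {
      initCount += 1
      Seq("_partition", "_file") // made-up metadata column names
    }
  }

  def main(args: Array[String]): Unit = {
    val relation = new Relation
    // Nothing has been computed before the first access.
    assert(relation.initCount == 0)
    val first = relation.metadataOutput
    val second = relation.metadataOutput
    // The initializer ran once and both accesses return the same instance.
    assert(relation.initCount == 1)
    assert(first eq second)
  }
}
```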

Add Metadata Columns to Output Columns

ExposesMetadataColumns
withMetadataColumns(): DataSourceV2Relation

withMetadataColumns is part of the ExposesMetadataColumns abstraction.

withMetadataColumns creates a DataSourceV2Relation with the extra metadata columns (if there are any) added to the output columns of this relation.
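
A minimal sketch of this behaviour follows, with columns modelled as plain strings and a hypothetical Relation case class standing in for DataSourceV2Relation. Skipping metadata columns that are already part of the output, and returning the same relation when nothing is added, are assumptions of this sketch.

```scala
// A minimal sketch of adding metadata columns to the output columns.
object WithMetadataColumnsSketch {
  // Hypothetical stand-in: columns are plain names rather than attributes.
  final case class Relation(output: Seq[String], metadataOutput: Seq[String]) {
    def withMetadataColumns(): Relation = {
      // Assumption: columns already in the output are not added twice.
      val missing = metadataOutput.filterNot(c => output.contains(c))
      // Assumption: with nothing to add, the relation is returned as-is.
      if (missing.nonEmpty) copy(output = output ++ missing) else this
    }
  }
}
```

For example, Relation(Seq("id"), Seq("_file")).withMetadataColumns() has the output Seq("id", "_file").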

Required Table Capabilities

TableCapabilityCheck is used to assert the following regarding DataSourceV2Relation and the Table:

  1. Table supports BATCH_READ
  2. Table supports BATCH_WRITE or V1_BATCH_WRITE for AppendData (append in batch mode)
  3. Table supports BATCH_WRITE with OVERWRITE_DYNAMIC for OverwritePartitionsDynamic (dynamic overwrite in batch mode)
  4. Table supports BATCH_WRITE or V1_BATCH_WRITE together with OVERWRITE_BY_FILTER (or, when the overwrite condition is always true, TRUNCATE or OVERWRITE_BY_FILTER) for OverwriteByExpression (overwrite by filter in batch mode and truncate in batch mode)
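
The assertions above can be sketched as a function from an operation and a set of table capabilities to a yes/no answer. Capability and Operation are stand-ins modelled for illustration (not Spark's TableCapability or logical operators), Scan is a hypothetical name for a plain batch read, and the handling of OverwriteByExpression is this sketch's reading of item 4: a batch write capability is always required, plus OVERWRITE_BY_FILTER, with TRUNCATE accepted as an alternative only when the condition is always true.

```scala
// A minimal sketch of the capability checks using stand-in types.
object CapabilityCheckSketch {
  sealed trait Capability
  case object BATCH_READ extends Capability
  case object BATCH_WRITE extends Capability
  case object V1_BATCH_WRITE extends Capability
  case object OVERWRITE_DYNAMIC extends Capability
  case object OVERWRITE_BY_FILTER extends Capability
  case object TRUNCATE extends Capability

  sealed trait Operation
  case object Scan extends Operation
  case object AppendData extends Operation
  case object OverwritePartitionsDynamic extends Operation
  // alwaysTrueCondition = true models a truncate-style overwrite.
  final case class OverwriteByExpression(alwaysTrueCondition: Boolean)
      extends Operation

  private def supportsBatchWrite(caps: Set[Capability]): Boolean =
    caps.contains(BATCH_WRITE) || caps.contains(V1_BATCH_WRITE)

  // Returns true when the table capabilities allow the operation.
  def isAllowed(op: Operation, caps: Set[Capability]): Boolean = op match {
    case Scan =>
      caps.contains(BATCH_READ)
    case AppendData =>
      supportsBatchWrite(caps)
    case OverwritePartitionsDynamic =>
      caps.contains(BATCH_WRITE) && caps.contains(OVERWRITE_DYNAMIC)
    case OverwriteByExpression(true) =>
      supportsBatchWrite(caps) &&
        (caps.contains(TRUNCATE) || caps.contains(OVERWRITE_BY_FILTER))
    case OverwriteByExpression(false) =>
      supportsBatchWrite(caps) && caps.contains(OVERWRITE_BY_FILTER)
  }
}
```

In this model a table with only BATCH_READ can be scanned but rejects every write operation.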