DataSourceV2Relation Leaf Logical Operator¶
DataSourceV2Relation is a leaf logical operator that represents a data scan over tables with support for BATCH_READ (at the very least).
Creating Instance¶
DataSourceV2Relation takes the following to be created:
- Table
- Output AttributeReferences
- (optional) CatalogPlugin
- (optional) Identifier
- Case-Insensitive Options
DataSourceV2Relation is created (indirectly) using the create utility and withMetadataColumns.
Creating DataSourceV2Relation¶
create(
table: Table,
catalog: Option[CatalogPlugin],
identifier: Option[Identifier]): DataSourceV2Relation
create(
table: Table,
catalog: Option[CatalogPlugin],
identifier: Option[Identifier],
options: CaseInsensitiveStringMap): DataSourceV2Relation
create replaces CharType and VarcharType types in the schema of the given Table with "annotated" StringType (as the query engine doesn't support char/varchar).
In the end, create uses the new schema to create a DataSourceV2Relation.
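The type replacement can be pictured with a small, Spark-free model. This is only a sketch under stated assumptions: the data types, the Field class and the OriginalTypeKey metadata key are hypothetical stand-ins (not Spark's classes), and the "annotation" is assumed to mean that the original type is remembered in the field metadata. Nested types are not handled.

```scala
// Hypothetical, simplified stand-ins for Spark's data types (not the real API).
object CharVarcharSketch {
  sealed trait DataType
  case object LongType extends DataType
  case object StringType extends DataType
  final case class CharType(length: Int) extends DataType
  final case class VarcharType(length: Int) extends DataType

  // A field carries free-form metadata, used here to remember the original type.
  final case class Field(
      name: String,
      dataType: DataType,
      metadata: Map[String, String] = Map.empty)

  // Hypothetical metadata key (an assumption made for this sketch).
  val OriginalTypeKey = "originalType"

  // Replaces top-level char/varchar fields with "annotated" strings.
  // All other fields are returned unchanged.
  def replaceCharVarchar(schema: Seq[Field]): Seq[Field] = schema.map { field =>
    field.dataType match {
      case CharType(n) =>
        field.copy(
          dataType = StringType,
          metadata = field.metadata + (OriginalTypeKey -> s"char($n)"))
      case VarcharType(n) =>
        field.copy(
          dataType = StringType,
          metadata = field.metadata + (OriginalTypeKey -> s"varchar($n)"))
      case _ => field
    }
  }
}
```

With this model, a field declared as CharType(3) comes back as a StringType field whose metadata still records char(3), so the original declaration is not lost.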
create is used when:
- CatalogV2Util utility is used to loadRelation
- DataFrameWriter is requested to insertInto, saveAsTable and saveInternal
- DataSourceV2Strategy execution planning strategy is requested to invalidateCache
- RenameTableExec physical command is executed
- ResolveTables logical resolution rule is executed (and requested to lookupV2Relation)
- ResolveRelations logical resolution rule is executed (and requested to lookupRelation)
- DataFrameReader is requested to load data
MultiInstanceRelation¶
DataSourceV2Relation is a MultiInstanceRelation.
Metadata Columns¶
metadataOutput: Seq[AttributeReference]
metadataOutput is part of the LogicalPlan abstraction.
metadataOutput requests the Table for the metadata columns (if it is a SupportsMetadataColumns).
metadataOutput filters out metadata columns with the same name as regular output columns.
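The filtering described above can be sketched without Spark on the classpath. Everything in this model is hypothetical: Column, PlainTable and TableWithMetadata stand in for the real attribute and table types, and the sameName parameter is an assumption that leaves open how names are compared (exact match by default).

```scala
// Hypothetical, simplified model of metadataOutput (not the real Spark classes).
object MetadataOutputSketch {
  final case class Column(name: String)

  sealed trait Table { def name: String }
  // A table without metadata columns.
  final case class PlainTable(name: String) extends Table
  // A table that exposes metadata columns (stand-in for SupportsMetadataColumns).
  final case class TableWithMetadata(
      name: String,
      metadataColumns: Seq[Column]) extends Table

  // Returns the metadata columns that do not clash with regular output columns.
  // `sameName` decides whether two column names are considered equal.
  def metadataOutput(
      table: Table,
      output: Seq[Column],
      sameName: (String, String) => Boolean = (a, b) => a == b): Seq[Column] =
    table match {
      case t: TableWithMetadata =>
        t.metadataColumns.filterNot(m => output.exists(o => sameName(m.name, o.name)))
      case _ => Seq.empty
    }
}
```

In this model, a metadata column named id is dropped when the regular output already has an id column, while a non-conflicting column such as _partition is kept.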
Creating DataSourceV2Relation with Metadata Columns¶
withMetadataColumns(): DataSourceV2Relation
withMetadataColumns creates a DataSourceV2Relation with the extra metadataOutput (for the output attributes) if defined.
withMetadataColumns is used when:
- AddMetadataColumns logical resolution rule is executed
Required Table Capabilities¶
TableCapabilityCheck is used to assert the following regarding DataSourceV2Relation and the Table:
- Table supports BATCH_READ
- Table supports BATCH_WRITE or V1_BATCH_WRITE for AppendData (append in batch mode)
- Table supports BATCH_WRITE with OVERWRITE_DYNAMIC for OverwritePartitionsDynamic (dynamic overwrite in batch mode)
- Table supports BATCH_WRITE, V1_BATCH_WRITE or OVERWRITE_BY_FILTER possibly with TRUNCATE for OverwriteByExpression (truncate in batch mode and overwrite by filter in batch mode)
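The first three checks in the list above can be modelled as plain predicates over a set of capabilities. This is a minimal sketch, not the TableCapabilityCheck implementation: the Capability objects below are hypothetical stand-ins named after the capabilities in the list, and the OverwriteByExpression check is deliberately left out.

```scala
// Hypothetical, simplified model of the capability checks (not the real Spark classes).
object CapabilityCheckSketch {
  sealed trait Capability
  case object BATCH_READ extends Capability
  case object BATCH_WRITE extends Capability
  case object V1_BATCH_WRITE extends Capability
  case object OVERWRITE_DYNAMIC extends Capability

  // Reading a relation requires BATCH_READ.
  def supportsBatchRead(caps: Set[Capability]): Boolean =
    caps.contains(BATCH_READ)

  // AppendData requires BATCH_WRITE or V1_BATCH_WRITE.
  def supportsAppend(caps: Set[Capability]): Boolean =
    caps.contains(BATCH_WRITE) || caps.contains(V1_BATCH_WRITE)

  // OverwritePartitionsDynamic requires BATCH_WRITE together with OVERWRITE_DYNAMIC.
  def supportsDynamicOverwrite(caps: Set[Capability]): Boolean =
    caps.contains(BATCH_WRITE) && caps.contains(OVERWRITE_DYNAMIC)
}
```

Note the asymmetry the model makes visible: a V1_BATCH_WRITE-only table passes the append check but not the dynamic overwrite check.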
Name¶
name: String
name is part of the NamedRelation abstraction.
name requests the Table for the name.
Simple Node Description¶
simpleString(
maxFields: Int): String
simpleString is part of the TreeNode abstraction.
simpleString gives the following (with the output and the name):
RelationV2[output] [name]