Metadata Columns¶
Spark 3.1.1 (SPARK-31255) introduced support for MetadataColumns for additional metadata of a row.
MetadataColumns can be defined for Tables with SupportsMetadataColumns.
Use the DESCRIBE TABLE EXTENDED SQL command to display the metadata columns of a table.
__metadata_col¶
__metadata_col is used when:
- MetadataAttribute is created and destructured
- FileSourceMetadataAttribute is created and requested to removeInternalMetadata
- FileSourceConstantMetadataAttribute is created
- FileSourceGeneratedMetadataAttribute is created
- MetadataColumnHelper is requested to isMetadataCol and markAsQualifiedAccessOnly
- MetadataColumnsHelper is requested to asStruct
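As a rough illustration of how a metadata key such as __metadata_col can tag an attribute, the following is a toy model in plain Scala. It does not use Spark's classes: `Attr`, `markAsMetadataCol`, and the value stored under the key are hypothetical, and only the key name and the `isMetadataCol` name come from the list above. The sketch assumes that the presence of the key is what identifies a metadata column.

```scala
object MetadataColKeySketch {
  // The key name comes from the section above.
  val MetadataColAttrKey = "__metadata_col"

  // Hypothetical stand-in for a Catalyst attribute: a name plus key-value metadata.
  final case class Attr(name: String, metadata: Map[String, String] = Map.empty)

  // Hypothetical helper: tag an attribute as a metadata column.
  // The value stored under the key is arbitrary in this sketch.
  def markAsMetadataCol(attr: Attr): Attr =
    attr.copy(metadata = attr.metadata + (MetadataColAttrKey -> "true"))

  // Assumption: an attribute counts as a metadata column when the key is present.
  def isMetadataCol(attr: Attr): Boolean =
    attr.metadata.contains(MetadataColAttrKey)

  def main(args: Array[String]): Unit = {
    val id = Attr("id")
    val index = markAsMetadataCol(Attr("index"))
    // Print each attribute name together with whether it is tagged.
    println(Seq(id, index).map(a => a.name -> isMetadataCol(a)))
  }
}
```

Keeping the marker in the attribute's metadata, rather than in a separate type, means an ordinary attribute and a metadata attribute can travel through the same code paths and be told apart only when needed.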
Logical Operators¶
Logical operators propagate metadata columns using metadataOutput.
ExposesMetadataColumns logical operators can generate metadata columns.
DataSourceV2Relation¶
MetadataColumns are filtered out of the metadataOutput of the DataSourceV2Relation leaf logical operator when their names conflict with output columns.
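The name-conflict rule can be sketched in plain Scala as follows. This is a simplified model, not Spark's implementation: `Column`, the `caseSensitive` flag, and the sample column names are hypothetical, and the name comparison is assumed to be case-insensitive by default.

```scala
object MetadataOutputSketch {
  // Hypothetical stand-in for a column (data or metadata), identified by name only.
  final case class Column(name: String)

  // Keep only the metadata columns whose names do not clash with an output column.
  // Assumption: names are compared case-insensitively unless caseSensitive is set.
  def metadataOutput(
      output: Seq[Column],
      metadataColumns: Seq[Column],
      caseSensitive: Boolean = false): Seq[Column] = {
    def sameName(a: String, b: String): Boolean =
      if (caseSensitive) a == b else a.equalsIgnoreCase(b)
    metadataColumns.filterNot(m => output.exists(o => sameName(m.name, o.name)))
  }

  def main(args: Array[String]): Unit = {
    val output = Seq(Column("id"), Column("line"))
    val metadataColumns = Seq(Column("line"), Column("_partition"))
    // "line" clashes with a data column, so only "_partition" is left.
    println(metadataOutput(output, metadataColumns).map(_.name))
  }
}
```

With this rule, a data column always wins over a metadata column of the same name, so adding metadata columns to a table cannot change what an existing query over its data columns returns.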