Hidden File Metadata

Hidden File Metadata (Constant Metadata Columns) lets users query the metadata of input files (e.g. file path, file name) for all file formats. The metadata is exposed as built-in hidden columns, which means users see these columns only when they reference them explicitly.
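A minimal sketch of what "hidden" means in practice, assuming Spark 3.3.0 or later, a local session, and the built-in _metadata struct column with its file_name and file_path fields. The temporary directory and the names path, df and withMeta are made up for this illustration.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Reuses the active session (e.g. in spark-shell) or starts a local one
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("hidden-file-metadata-demo")
  .getOrCreate()

// Hypothetical location for a tiny parquet dataset
val path = Files.createTempDirectory("hidden-file-metadata").resolve("data").toString
spark.range(3).write.parquet(path)

val df = spark.read.parquet(path)

// The hidden column is not part of the regular schema
df.printSchema()

// It shows up only when referenced explicitly
val withMeta = df.select("id", "_metadata.file_name", "_metadata.file_path")
withMeta.show(truncate = false)
```

Selecting `*` alone would return just the data columns, so an explicit reference to _metadata is needed whenever the file-level details are required.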

Hidden File Metadata is only available for FileFormat connectors.

MetadataAttribute is an AttributeReference with __metadata_col internal metadata.

Hidden File Metadata was introduced in Spark SQL 3.3.0.

Metadata Columns

Hidden File Metadata is logically a subset of Metadata Columns (which are a feature of FileTables).

__metadata_col Internal Metadata

__metadata_col is the name (key) of an internal metadata entry.

__metadata_col is associated with the name of an Attribute when the attribute is requested to markAsQualifiedAccessOnly.

__metadata_col is removed when removing internal metadata.

__metadata_col is used when:

  • MetadataAttribute is requested to isValid
  • MetadataAttributeWithLogicalName is requested to unapply
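The life cycle of the key (set, checked, removed) can be imitated with Spark's public MetadataBuilder API. This is only a sketch under stated assumptions and not the internal code path of MetadataAttribute: the constant MetadataColKey, the value "file_path" and the variable names are hypothetical and used for illustration only.

```scala
import org.apache.spark.sql.types.{Metadata, MetadataBuilder}

// Key name taken from the text above; the constant itself is hypothetical
val MetadataColKey = "__metadata_col"

// Tag some column metadata with the key (the value stands in for an attribute name)
val tagged: Metadata = new MetadataBuilder()
  .putString(MetadataColKey, "file_path")
  .build()

// A validity check in the spirit of isValid: is the key present?
val isTagged = tagged.contains(MetadataColKey)

// Removing the internal key yields metadata without it
val cleaned: Metadata = new MetadataBuilder()
  .withMetadata(tagged)
  .remove(MetadataColKey)
  .build()

println(s"tagged: $isTagged, after removal: ${cleaned.contains(MetadataColKey)}")
```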