FileDataSourceV2 Table Providers¶
FileDataSourceV2 is an extension of the TableProvider abstraction for file-based table providers.
Contract¶
fallbackFileFormat¶
fallbackFileFormat: Class[_ <: FileFormat]
A V1 FileFormat class of this file-based data source
See:
Used when:
DDLUtils is requested to checkDataColNames
DataSource is requested for the providingClass (for resolving data source relation for catalog tables)
PreprocessTableCreation logical analysis rule is executed
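The fallback idea can be illustrated with a minimal, Spark-free sketch. All types below (`SketchFileFormat`, `SketchFileDataSourceV2`, and so on) are hypothetical stand-ins rather than the real Spark classes, and `providingClass` here only models how a V1 code path might pick the class to work with; it is not the actual `DataSource` implementation.

```scala
// Hypothetical stand-ins for the Spark types named in the text.
trait SketchFileFormat
class SketchParquetFileFormat extends SketchFileFormat

// Models the contract: a file-based V2 source names its V1 FileFormat class.
trait SketchFileDataSourceV2 {
  def fallbackFileFormat: Class[_ <: SketchFileFormat]
}

class SketchParquetDataSourceV2 extends SketchFileDataSourceV2 {
  override def fallbackFileFormat: Class[_ <: SketchFileFormat] =
    classOf[SketchParquetFileFormat]
}

object FallbackSketch {
  // For a file-based V2 source, use its V1 fallback class;
  // for anything else, use the provider's own class.
  def providingClass(provider: AnyRef): Class[_] = provider match {
    case v2: SketchFileDataSourceV2 => v2.fallbackFileFormat
    case other                      => other.getClass
  }
}
```

In this model, a V1-only code path never has to know about the V2 class: it asks for the providing class and receives the V1 format instead.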
Table¶
getTable(
options: CaseInsensitiveStringMap): Table
getTable(
options: CaseInsensitiveStringMap,
schema: StructType): Table
getTable(
schema: StructType,
partitioning: Array[Transform],
properties: Map[String, String]): Table // Part of the TableProvider abstraction
A Table of this table provider
See:
Used when:
FileDataSourceV2 is requested for a table (as a TableProvider) and to inferSchema
Implementations¶
AvroDataSourceV2
CSVDataSourceV2
JsonDataSourceV2
OrcDataSourceV2
ParquetDataSourceV2
TextDataSourceV2
DataSourceRegister¶
FileDataSourceV2 is a DataSourceRegister.
Schema Inference¶
inferSchema(
options: CaseInsensitiveStringMap): StructType
inferSchema is part of the TableProvider abstraction.
inferSchema requests the Table for the schema.
If the Table is not available yet, inferSchema first creates it (using getTable with the given options) and "saves" it for later (in the t internal registry).
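The create-once-and-reuse behaviour described above can be sketched without Spark. `SketchSchema`, `SketchTable`, `CachingProvider`, and the `tablesCreated` counter are hypothetical names introduced only for this illustration; the private `t` field plays the role of the registry mentioned in the text.

```scala
// Hypothetical stand-ins, not the real Spark StructType and Table.
final case class SketchSchema(fieldNames: Seq[String])
final case class SketchTable(name: String, schema: SketchSchema)

abstract class CachingProvider {
  // Counts table creations (exists only to make the caching observable).
  var tablesCreated: Int = 0

  // Stand-in for getTable(options).
  protected def createTable(options: Map[String, String]): SketchTable

  // Stand-in for the internal `t` registry; null until first use.
  private var t: SketchTable = null

  def inferSchema(options: Map[String, String]): SketchSchema = {
    if (t == null) {
      // No table yet: create one and save it for later requests.
      t = createTable(options)
      tablesCreated += 1
    }
    t.schema
  }
}

object SchemaInferenceSketch {
  def newProvider(): CachingProvider = new CachingProvider {
    protected def createTable(options: Map[String, String]): SketchTable =
      SketchTable(options.getOrElse("path", "unknown"), SketchSchema(Seq("id", "name")))
  }
}
```

One consequence of this design, at least in the sketch, is that options passed to a later `inferSchema` call are ignored once a table has been saved.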
Table Name¶
getTableName(
map: CaseInsensitiveStringMap,
paths: Seq[String]): String
getTableName uses the short name and the given paths to create the following table name (possibly redacting sensitive parts per spark.sql.redaction.string.regex):
[short name] [comma-separated paths]
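As a rough, Spark-free illustration of the naming scheme above, the following sketch joins a short name and paths and then optionally applies a redaction pattern. `TableNameSketch`, its `tableName` helper, and the `Redacted` replacement text are assumptions made for this example; in Spark the pattern would come from the `spark.sql.redaction.string.regex` configuration property rather than a method argument.

```scala
import scala.util.matching.Regex

object TableNameSketch {
  // Hypothetical replacement text for redacted parts.
  val Redacted = "*********(redacted)"

  def tableName(
      shortName: String,
      paths: Seq[String],
      redaction: Option[Regex]): String = {
    // [short name] [comma-separated paths]
    val name = shortName + " " + paths.mkString(",")
    redaction match {
      // Replace every match of the pattern with the fixed replacement text.
      case Some(pattern) =>
        pattern.replaceAllIn(name, Regex.quoteReplacement(Redacted))
      case None => name
    }
  }
}
```

For example, `TableNameSketch.tableName("parquet", Seq("/data/a", "/data/b"), None)` gives `parquet /data/a,/data/b`.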
Paths¶
getPaths(
map: CaseInsensitiveStringMap): Seq[String]
getPaths concatenates the values of the paths and path keys (from the given map).
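The concatenation can be sketched in plain Scala as follows. This is a simplified model: `PathsSketch` and `decodePaths` are hypothetical, a regular `Map` with lower-cased keys stands in for `CaseInsensitiveStringMap`, and the comma-separated encoding of the `paths` value is an assumption made only for this example (the text above does not say how multiple paths are encoded).

```scala
object PathsSketch {
  // Assumption for this sketch only: multiple paths are comma-separated.
  def decodePaths(value: String): Seq[String] =
    value.split(",").map(_.trim).filter(_.nonEmpty).toSeq

  def getPaths(options: Map[String, String]): Seq[String] = {
    // Mimic case-insensitive key lookup by lower-casing all keys.
    val ci = options.map { case (k, v) => k.toLowerCase -> v }
    // Values of the `paths` key first, then the single `path` value (if any).
    val multiple = ci.get("paths").map(decodePaths).getOrElse(Seq.empty)
    val single = ci.get("path").toSeq
    multiple ++ single
  }
}
```

With both keys present, the `paths` values come first and the `path` value is appended at the end; with neither key, the result is empty.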