FileDataSourceV2 Table Providers¶
FileDataSourceV2 is an extension of the TableProvider abstraction for file-based table providers.
Contract¶
fallbackFileFormat¶
fallbackFileFormat: Class[_ <: FileFormat]
A V1 FileFormat class of this file-based data source
Used when:
- DDLUtils is requested to checkDataColNames
- DataSource is requested for the providingClass (for resolving data source relation for catalog tables)
- PreprocessTableCreation logical analysis rule is executed
Table¶
getTable(
options: CaseInsensitiveStringMap): Table
getTable(
options: CaseInsensitiveStringMap,
schema: StructType): Table
getTable(
schema: StructType,
partitioning: Array[Transform],
properties: Map[String, String]): Table // Part of the TableProvider abstraction
A Table of this table provider
Used when:
- FileDataSourceV2 is requested for a table (as a TableProvider) and to inferSchema
Implementations¶
- AvroDataSourceV2
- CSVDataSourceV2
- JsonDataSourceV2
- OrcDataSourceV2
- ParquetDataSourceV2
- TextDataSourceV2
DataSourceRegister¶
FileDataSourceV2 is a DataSourceRegister.
Schema Inference¶
inferSchema(
options: CaseInsensitiveStringMap): StructType
inferSchema is part of the TableProvider abstraction.
inferSchema requests the Table for the schema.
If the Table is not available yet, inferSchema creates one and "saves" it for later use (in the t internal registry).
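The create-once-then-reuse behaviour described above can be sketched without Spark on the classpath. This is a simplified model only: SketchTable, SketchProvider and the createTable function are hypothetical stand-ins (for Table, FileDataSourceV2 and getTable respectively), and a plain Map replaces CaseInsensitiveStringMap.

```scala
// A simplified, Spark-free model of the caching described above.
// SketchTable and SketchProvider are hypothetical names, not Spark classes.
object InferSchemaSketch {
  // Stand-in for a Table; the schema is reduced to a list of column names.
  final case class SketchTable(schema: Seq[String])

  class SketchProvider(createTable: Map[String, String] => SketchTable) {
    // Plays the role of the t registry: empty until the first request.
    private var t: Option[SketchTable] = None

    def inferSchema(options: Map[String, String]): Seq[String] = {
      // Create the table on first use and "save" it for later.
      if (t.isEmpty) t = Some(createTable(options))
      // Request the saved table for its schema.
      t.get.schema
    }
  }
}
```

In this model, calling inferSchema twice on the same provider invokes createTable only once; the second call is served from the saved table.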
Table Name¶
getTableName(
map: CaseInsensitiveStringMap,
paths: Seq[String]): String
getTableName uses the short name and the given paths to create the following table name (possibly redacting sensitive parts per spark.sql.redaction.string.regex):
[short name] [comma-separated paths]
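A minimal, Spark-free sketch of how such a name could be assembled and redacted. TableNameSketch, tableName and the Redacted replacement text are assumptions made for illustration; the optional regex stands in for the value of spark.sql.redaction.string.regex, and the real method may also qualify the paths before joining them.

```scala
import scala.util.matching.Regex

// Hypothetical helper modelling the table name format; not Spark code.
object TableNameSketch {
  // Assumed replacement text; Spark's own redaction marker may differ.
  val Redacted = "*********(redacted)"

  def tableName(
      shortName: String,
      paths: Seq[String],
      redaction: Option[Regex]): String = {
    // [short name] [comma-separated paths]
    val name = shortName + " " + paths.mkString(",")
    // With no redaction regex configured, the name is returned unchanged.
    redaction.fold(name)(_.replaceAllIn(name, Regex.quoteReplacement(Redacted)))
  }
}
```

For example, `TableNameSketch.tableName("parquet", Seq("/data/a", "/data/b"), None)` yields `parquet /data/a,/data/b` in this model.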
Paths¶
getPaths(
map: CaseInsensitiveStringMap): Seq[String]
getPaths concatenates the values of the paths and path keys (from the given map).
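The concatenation can be modelled with a plain Map, as in the sketch below. Several things here are assumptions: the paths value is taken to encode multiple paths in a single string, decodePaths is a hypothetical decoder for that string, naiveDecode is a deliberately naive stand-in for it (it breaks on paths that contain commas), and lower-case keys replace the case-insensitive lookup of CaseInsensitiveStringMap.

```scala
// Hypothetical, Spark-free model of getPaths; not Spark code.
object PathsSketch {
  def getPaths(
      options: Map[String, String],
      decodePaths: String => Seq[String]): Seq[String] = {
    // Values under "paths" come first (if the key is present)...
    val multiple = options.get("paths").map(decodePaths).getOrElse(Seq.empty)
    // ...followed by the single "path" value (if the key is present).
    multiple ++ options.get("path").toSeq
  }

  // Naive stand-in decoder: strips brackets and quotes, splits on commas.
  def naiveDecode(value: String): Seq[String] =
    value
      .stripPrefix("[")
      .stripSuffix("]")
      .split(",")
      .map(_.trim.stripPrefix("\"").stripSuffix("\""))
      .filter(_.nonEmpty)
      .toSeq
}
```

When neither key is present, the model returns an empty sequence.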