# DeltaParquetFileFormat

`DeltaParquetFileFormat` is a `ParquetFileFormat` (Spark SQL) that supports no restrictions on column names.
## Creating Instance

`DeltaParquetFileFormat` takes the following to be created:

- `DeltaColumnMappingMode`
- Reference schema (`StructType`)

`DeltaParquetFileFormat` is created when:

- `DeltaFileFormat` is requested for the `fileFormat`
## Building Data Reader With Partition Values

```scala
buildReaderWithPartitionValues(
  sparkSession: SparkSession,
  dataSchema: StructType,
  partitionSchema: StructType,
  requiredSchema: StructType,
  filters: Seq[Filter],
  options: Map[String, String],
  hadoopConf: Configuration): PartitionedFile => Iterator[InternalRow]
```

`buildReaderWithPartitionValues` prepares the given schemas (i.e., `dataSchema`, `partitionSchema` and `requiredSchema`) before requesting the parent `ParquetFileFormat` to `buildReaderWithPartitionValues`.

`buildReaderWithPartitionValues` is part of the `ParquetFileFormat` (Spark SQL) abstraction.
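The prepare-then-delegate pattern can be sketched Spark-free with simplified stand-in types (`Schema`, `parentBuildReader` and the lower-casing transform are illustrative assumptions, not Delta Lake's actual code):

```scala
// Stand-in for StructType: just a list of column names.
type Schema = Seq[String]

// Stand-in for prepareSchema: some per-schema transformation.
def prepareSchema(schema: Schema): Schema = schema.map(_.toLowerCase)

// Stand-in for the parent ParquetFileFormat.buildReaderWithPartitionValues.
def parentBuildReader(data: Schema, partition: Schema, required: Schema): String =
  s"reader(data=$data, partition=$partition, required=$required)"

// DeltaParquetFileFormat's pattern: prepare every schema, then delegate to the parent.
def buildReaderWithPartitionValues(data: Schema, partition: Schema, required: Schema): String =
  parentBuildReader(prepareSchema(data), prepareSchema(partition), prepareSchema(required))
```

Each of the three schemas goes through `prepareSchema` unchanged in order, so the parent reader only ever sees the prepared (physical) schemas.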
## Preparing Schema

```scala
prepareSchema(
  inputSchema: StructType): StructType
```

`prepareSchema` creates a physical schema (for the given `inputSchema`, the reference schema and the `DeltaColumnMappingMode`).
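Conceptually, creating a physical schema means swapping each logical column name for the physical name recorded in the reference schema when a column mapping mode is active. A Spark-free illustration of that idea (all names and types here are illustrative stand-ins, not Delta Lake's actual API):

```scala
// Stand-in for a reference-schema field carrying a logical-to-physical name mapping.
case class RefField(logicalName: String, physicalName: String)

// Replace logical names with physical names when column mapping is enabled;
// pass the schema through unchanged otherwise.
def prepareSchema(
    inputSchema: Seq[String],
    referenceSchema: Seq[RefField],
    mappingEnabled: Boolean): Seq[String] =
  if (!mappingEnabled) inputSchema
  else inputSchema.map { logical =>
    referenceSchema.find(_.logicalName == logical).map(_.physicalName).getOrElse(logical)
  }

val reference = Seq(RefField("id", "col-abc123"), RefField("name", "col-def456"))
prepareSchema(Seq("id", "name"), reference, mappingEnabled = true)
```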
## supportFieldName

```scala
supportFieldName(
  name: String): Boolean
```

`supportFieldName` is enabled (`true`) when the `DeltaColumnMappingMode` is not `NoMapping`. Otherwise, `supportFieldName` requests the parent `ParquetFileFormat` to `supportFieldName`.

`supportFieldName` is part of the `ParquetFileFormat` (Spark SQL) abstraction.
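A minimal Spark-free sketch of that decision, assuming a simplified character check as a stand-in for the parent `ParquetFileFormat`'s own field-name validation (not its exact rule):

```scala
// Stand-ins for Delta's column mapping modes.
sealed trait DeltaColumnMappingMode
case object NoMapping extends DeltaColumnMappingMode
case object NameMapping extends DeltaColumnMappingMode

// Stand-in for the parent ParquetFileFormat.supportFieldName:
// reject names containing characters Parquet traditionally disallows.
def parentSupportFieldName(name: String): Boolean =
  !name.exists(" ,;{}()\n\t=".contains(_))

// Any mapping mode other than NoMapping lifts the restriction entirely;
// with NoMapping, the parent's check decides.
def supportFieldName(mode: DeltaColumnMappingMode, name: String): Boolean =
  mode != NoMapping || parentSupportFieldName(name)

supportFieldName(NameMapping, "my column") // true: mapping mode lifts the restriction
supportFieldName(NoMapping, "my column")   // false: the parent check rejects spaces
```

This is why enabling column mapping allows otherwise-invalid column names: the physical Parquet files store mapped (safe) names, so the logical names never need to satisfy the parent's check.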