DeltaParquetFileFormat

DeltaParquetFileFormat is a ParquetFileFormat (Spark SQL) that lifts the restrictions on column names.

Creating Instance

DeltaParquetFileFormat takes the following to be created:

DeltaParquetFileFormat is created when:

Building Data Reader With Partition Values

buildReaderWithPartitionValues(
  sparkSession: SparkSession,
  dataSchema: StructType,
  partitionSchema: StructType,
  requiredSchema: StructType,
  filters: Seq[Filter],
  options: Map[String, String],
  hadoopConf: Configuration): PartitionedFile => Iterator[InternalRow]

buildReaderWithPartitionValues prepares each of the given schemas (dataSchema, partitionSchema and requiredSchema) using prepareSchema, and then requests the parent ParquetFileFormat to buildReaderWithPartitionValues with the prepared schemas.

buildReaderWithPartitionValues is part of the ParquetFileFormat (Spark SQL) abstraction.
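
The delegation described above can be sketched in plain Scala. This is a simplified model under stated assumptions, not Delta Lake's code: ReaderSketch, ParentFormat, DeltaLikeFormat, buildReader and the physicalNames map are hypothetical stand-ins, and a schema is reduced to a list of column names.

```scala
object ReaderSketch {
  // Assumption: a schema is modelled as column names only (no Spark types).
  type Schema = Seq[String]

  // Hypothetical stand-in for the parent ParquetFileFormat.
  class ParentFormat {
    // Returns a "reader" that lists the columns it would read from a file.
    def buildReader(
        dataSchema: Schema,
        partitionSchema: Schema,
        requiredSchema: Schema): String => Iterator[String] =
      file => requiredSchema.iterator.map(col => s"${file}:${col}")
  }

  // Hypothetical stand-in for DeltaParquetFileFormat.
  class DeltaLikeFormat(physicalNames: Map[String, String]) extends ParentFormat {
    // Maps logical column names to physical ones; unknown names are kept as-is.
    private def prepareSchema(schema: Schema): Schema =
      schema.map(name => physicalNames.getOrElse(name, name))

    // Prepares all three schemas, then delegates to the parent.
    override def buildReader(
        dataSchema: Schema,
        partitionSchema: Schema,
        requiredSchema: Schema): String => Iterator[String] =
      super.buildReader(
        prepareSchema(dataSchema),
        prepareSchema(partitionSchema),
        prepareSchema(requiredSchema))
  }
}
```

In this model the caller keeps working with logical names, while the parent only ever sees the prepared (physical) ones.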

Preparing Schema

prepareSchema(
  inputSchema: StructType): StructType

prepareSchema creates a physical schema (for the inputSchema, the referenceSchema and the DeltaColumnMappingMode).
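
As an illustration of what "creating a physical schema" could look like, here is a minimal plain-Scala sketch. Everything in it is an assumption made for the example: Field, Schema, RefField, MappingMode and the fallback to the logical name are hypothetical and do not mirror Spark's StructType or Delta Lake's actual implementation.

```scala
object PrepareSchemaSketch {
  // Simplified stand-ins; Spark's StructType and StructField are NOT used here.
  final case class Field(name: String, dataType: String)
  final case class Schema(fields: Seq[Field])

  // Hypothetical stand-ins for DeltaColumnMappingMode values.
  sealed trait MappingMode
  case object NoMapping extends MappingMode
  case object NameMapping extends MappingMode

  // Hypothetical reference schema entry: a logical column name and the
  // physical name the column is assumed to have in the Parquet files.
  final case class RefField(logicalName: String, physicalName: String)

  def prepareSchema(
      inputSchema: Schema,
      referenceSchema: Seq[RefField],
      mode: MappingMode): Schema = mode match {
    // Without column mapping, logical and physical names coincide.
    case NoMapping => inputSchema
    case NameMapping =>
      val physical = referenceSchema.map(f => f.logicalName -> f.physicalName).toMap
      // Assumption: fall back to the logical name when no mapping is known.
      Schema(inputSchema.fields.map(f => f.copy(name = physical.getOrElse(f.name, f.name))))
  }
}
```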

supportFieldName

supportFieldName(
  name: String): Boolean

supportFieldName is enabled (true) when the columnMappingMode is not NoMapping. Otherwise, supportFieldName requests the parent ParquetFileFormat to supportFieldName.

supportFieldName is part of the ParquetFileFormat (Spark SQL) abstraction.
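
The decision above can be modelled in a few lines of plain Scala. This is a sketch under assumptions rather than the actual implementation: SupportFieldNameSketch, MappingMode, ParentFormat and DeltaLikeFormat are hypothetical stand-ins, and the set of characters rejected by the parent is only assumed to resemble the restrictions Parquet places on column names.

```scala
object SupportFieldNameSketch {
  // Hypothetical stand-ins for DeltaColumnMappingMode values.
  sealed trait MappingMode
  case object NoMapping extends MappingMode
  case object NameMapping extends MappingMode

  // Stand-in for the parent ParquetFileFormat.
  class ParentFormat {
    // Assumption: characters the parent refuses in a column name.
    private val forbidden: Set[Char] = " ,;{}()\n\t=".toSet

    def supportFieldName(name: String): Boolean = !name.exists(forbidden)
  }

  // Stand-in for DeltaParquetFileFormat.
  class DeltaLikeFormat(columnMappingMode: MappingMode) extends ParentFormat {
    // Any name is accepted once column mapping is on; otherwise ask the parent.
    override def supportFieldName(name: String): Boolean =
      if (columnMappingMode != NoMapping) true else super.supportFieldName(name)
  }
}
```

With column mapping enabled, a name such as "my column" is accepted in this model because the files would store the column under a different, physical name.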