DataSkippingReaderBase¶
DataSkippingReaderBase is an extension of the DeltaScanGenerator abstraction for DeltaScan generators.
Contract¶
Dataset of AddFiles¶
allFiles: Dataset[AddFile]
Used when:
DataSkippingReaderBase is requested to withStatsInternal0, withNoStats, getAllFiles, filterOnPartitions, getSpecificFilesWithStats
DeltaLog¶
deltaLog: DeltaLog
Used when:
DataSkippingReaderBase is requested to filesForScan
Metadata¶
metadata: Metadata
Used when:
DataSkippingReaderBase is requested to columnMappingMode, getStatsColumnOpt, filesWithStatsForScan, constructPartitionFilters, filterOnPartitions, filesForScan
numOfFiles¶
numOfFiles: Long
Used when:
DataSkippingReaderBase is requested to filesForScan
Path¶
path: Path
Redacted Path¶
redactedPath: String
Used when:
DataSkippingReaderBase is requested to withStatsCache
Schema¶
schema: StructType
Used when:
DataSkippingReaderBase is requested to filesForScan
sizeInBytes¶
sizeInBytes: Long
Used when:
DataSkippingReaderBase is requested to filesForScan
version¶
version: Long
Used when:
DataSkippingReaderBase is requested to withStatsCache, filesForScan
Implementations¶
spark.databricks.delta.stats.skipping¶
useStats: Boolean
useStats is the value of the spark.databricks.delta.stats.skipping configuration property.
useStats is used when:
DataSkippingReaderBase is requested to filesForScan
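As a configuration sketch, the property behind useStats can be changed per session. The snippet assumes a spark-shell session (where `spark` is already defined) started with Delta Lake on the classpath. The value shown is only an illustration, not a recommendation.

```scala
// Assumption: run in spark-shell with Delta Lake available; `spark` is the active SparkSession.

// Turn off statistics-based data skipping for this session.
spark.conf.set("spark.databricks.delta.stats.skipping", "false")

// Read the value back to confirm what later scans will see.
val skipping = spark.conf.get("spark.databricks.delta.stats.skipping")
println(s"stats.skipping = $skipping")
```

Since useStats is consulted when filesForScan is requested, a change like this would be expected to affect scans planned afterwards in the same session.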
filesForScan¶
filesForScan(
projection: Seq[Attribute],
filters: Seq[Expression]): DeltaScan // (1)!
filesForScan(
projection: Seq[Attribute],
filters: Seq[Expression],
keepNumRecords: Boolean): DeltaScan
1. keepNumRecords flag is false
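The relationship between the two signatures can be sketched as a plain overload, where the shorter variant supplies `keepNumRecords = false`. This is a minimal, self-contained illustration only: `Attribute`, `Expression` and `DeltaScan` below are hypothetical stand-ins for the real Catalyst and Delta Lake types, and `ScanGeneratorSketch` is not a Delta Lake trait.

```scala
// Hypothetical stand-in types, for illustration only.
final case class Attribute(name: String)
final case class Expression(sql: String)
final case class DeltaScan(filters: Seq[Expression], keepNumRecords: Boolean)

// Hypothetical trait mirroring the two filesForScan signatures.
trait ScanGeneratorSketch {
  // Two-argument variant: delegates with the keepNumRecords flag off.
  def filesForScan(
      projection: Seq[Attribute],
      filters: Seq[Expression]): DeltaScan =
    filesForScan(projection, filters, keepNumRecords = false)

  // Three-argument variant: left abstract here, since the real logic is not described.
  def filesForScan(
      projection: Seq[Attribute],
      filters: Seq[Expression],
      keepNumRecords: Boolean): DeltaScan
}

object FilesForScanSketch {
  // Toy implementation that only records the arguments it was given.
  val gen: ScanGeneratorSketch = new ScanGeneratorSketch {
    def filesForScan(
        projection: Seq[Attribute],
        filters: Seq[Expression],
        keepNumRecords: Boolean): DeltaScan =
      DeltaScan(filters, keepNumRecords)
  }

  // Calling the two-argument variant leaves keepNumRecords disabled.
  val scan: DeltaScan =
    gen.filesForScan(Seq(Attribute("id")), Seq(Expression("id > 5")))

  def main(args: Array[String]): Unit = println(scan)
}
```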
filesForScan ...FIXME
filesForScan is part of the DeltaScanGeneratorBase abstraction.