PreparedDeltaFileIndex¶
PreparedDeltaFileIndex is a TahoeFileIndexWithSnapshotDescriptor that delegates all of its work to a DeltaScan.
Creating Instance¶
PreparedDeltaFileIndex takes the following to be created:
- SparkSession (Spark SQL)
- DeltaLog
- Hadoop Path
- DeltaScan
- Partition schema (StructType)
- Version scanned
PreparedDeltaFileIndex is created when:
- PrepareDeltaScanBase logical optimization rule is executed
DeltaScan¶
PreparedDeltaFileIndex is given a DeltaScan when created.
The DeltaScan is used for all its methods.
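The delegation idea can be illustrated with a minimal, self-contained sketch. It does not use Delta Lake's classes: `FileEntry`, `ScanSketch`, and `PreparedIndexSketch` are hypothetical stand-ins, and the method bodies are assumptions made for illustration only, not the actual implementation. The sketch only shows the general shape of an index that answers every question from a scan result computed ahead of time.

```scala
// Hypothetical stand-ins for illustration only (NOT Delta Lake's AddFile / DeltaScan).
final case class FileEntry(path: String, size: Long)
final case class ScanSketch(files: Seq[FileEntry])

// A hypothetical index that holds a pre-computed scan and answers
// every request from it instead of listing files again.
final class PreparedIndexSketch(tablePath: String, scan: ScanSketch) {

  // Assumption: the files to read are exactly the ones the scan already selected.
  def matchingFiles: Seq[FileEntry] = scan.files

  // Assumption: input files are the scanned files resolved against the table path.
  def inputFiles: Array[String] =
    scan.files.map(f => s"$tablePath/${f.path}").toArray

  // Assumption: the estimated size is the total size of the scanned files.
  def sizeInBytes: Long = scan.files.map(_.size).sum
}

object PreparedIndexSketchDemo {
  def main(args: Array[String]): Unit = {
    val scan = ScanSketch(Seq(
      FileEntry("part-0.parquet", 100L),
      FileEntry("part-1.parquet", 250L)))
    val index = new PreparedIndexSketch("/tmp/delta/t", scan)

    println(index.inputFiles.mkString(", "))
    println(index.sizeInBytes)
  }
}
```

The point of the design is that the scan is computed once (during optimization) and then reused, so the index never has to repeat the file selection on its own.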
Input Files¶
inputFiles: Array[String]
inputFiles...FIXME
inputFiles is part of the FileIndex (Spark SQL) abstraction.
Matching Data Files¶
matchingFiles(
partitionFilters: Seq[Expression],
dataFilters: Seq[Expression]): Seq[AddFile]
matchingFiles...FIXME
matchingFiles is part of the TahoeFileIndex abstraction.
Estimated Size¶
sizeInBytes: Long
sizeInBytes...FIXME
sizeInBytes is part of the FileIndex (Spark SQL) abstraction.