TahoeBatchFileIndex is a concrete file index for a given version of a delta table.

TahoeBatchFileIndex is created when:

Creating TahoeBatchFileIndex Instance

TahoeBatchFileIndex takes the following to be created:

TahoeBatchFileIndex initializes the internal properties.

tableVersion Method

tableVersion: Long
tableVersion is part of the TahoeFileIndex contract for the version of the delta table.


matchingFiles Method

matchingFiles(
  partitionFilters: Seq[Expression],
  dataFilters: Seq[Expression],
  keepStats: Boolean = false): Seq[AddFile]
matchingFiles is part of the TahoeFileIndex contract for the matching (valid) files by the given filtering expressions.


inputFiles Method

inputFiles: Array[String]
inputFiles is part of the FileIndex contract to…​FIXME


Schema of Partition Columns — partitionSchema Method

partitionSchema: StructType
partitionSchema is part of the FileIndex contract (Spark SQL) to get the schema of the partition columns (if used).

partitionSchema requests the Snapshot for its metadata and, in turn, requests the metadata for the partitionSchema.
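
The delegation chain above can be sketched as follows. This is a minimal illustration, not the Delta Lake source: Field, Metadata, Snapshot and BatchFileIndexSketch are hypothetical, simplified stand-ins (a Seq[Field] plays the role of Spark SQL's StructType), and deriving the partition schema from the partition column names is an assumption made for the example.

```scala
// Simplified stand-ins for illustration only; NOT the Delta Lake or Spark SQL classes.
final case class Field(name: String, dataType: String)

final case class Metadata(schema: Seq[Field], partitionColumns: Seq[String]) {
  // Assumption: the partition schema is the subset of the table schema
  // named by the partition columns, in partition-column order.
  def partitionSchema: Seq[Field] =
    partitionColumns.flatMap(col => schema.find(_.name == col))
}

final case class Snapshot(version: Long, metadata: Metadata)

final class BatchFileIndexSketch(snapshot: Snapshot) {
  // Snapshot -> metadata -> partitionSchema, as described in the text.
  def partitionSchema: Seq[Field] = snapshot.metadata.partitionSchema
}
```

With a table schema of (id, date) partitioned by date, the sketch would return only the date field; an unpartitioned table yields an empty schema.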

sizeInBytes Property

sizeInBytes: Long
sizeInBytes is part of the FileIndex contract (Spark SQL) for the table size (in bytes).

sizeInBytes is the sum of the sizes of all AddFiles.
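
The summation can be sketched as below. AddFile here is a hypothetical, reduced stand-in that carries only a path and a size (the real action has more fields), and SizeSketch is a name invented for this example.

```scala
// Hypothetical, reduced stand-in for Delta Lake's AddFile action.
final case class AddFile(path: String, size: Long)

object SizeSketch {
  // Adds up the size of every file; an empty collection sums to 0.
  def sizeInBytes(addFiles: Seq[AddFile]): Long =
    addFiles.map(_.size).sum
}
```

Under these assumptions, two files of 10 and 32 bytes give a table size of 42 bytes.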