CdcAddFileIndex¶
CdcAddFileIndex is a TahoeBatchFileIndex with the following:
| Property | Value |
|---|---|
| Action Type | cdcRead |
| addFiles | The AddFiles of the given CDCDataSpecs |
CdcAddFileIndex is used by CDCReaderImpl to scanIndex.
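As a rough illustration of the addFiles row above, the sketch below flattens per-version specs into a single sequence of files. The `AddFile` and `CDCDataSpec` case classes here are simplified stand-ins written for this example (they are not the Delta Lake classes, which carry more fields), and `CdcAddFilesSketch` is a hypothetical name.

```scala
// Simplified stand-ins for illustration only; NOT the Delta Lake classes.
final case class AddFile(path: String, partitionValues: Map[String, String])
final case class CDCDataSpec[T](version: Long, actions: Seq[T])

object CdcAddFilesSketch {
  // Flattens the per-version specs into one sequence of files,
  // keeping the order of versions and of files within each version.
  def addFiles(filesByVersion: Seq[CDCDataSpec[AddFile]]): Seq[AddFile] =
    filesByVersion.flatMap(_.actions)

  def main(args: Array[String]): Unit = {
    val specs = Seq(
      CDCDataSpec(0L, Seq(AddFile("part-0.parquet", Map("date" -> "2024-01-01")))),
      CDCDataSpec(1L, Seq(
        AddFile("part-1.parquet", Map.empty),
        AddFile("part-2.parquet", Map.empty))))
    // Prints the paths of all files across both versions.
    println(addFiles(specs).map(_.path))
  }
}
```

The sketch only shows the flattening step; how the real index attaches the commit version to each file is not modelled here.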
Creating Instance¶
CdcAddFileIndex takes the following to be created:
- SparkSession
- AddFiles by Version (Seq[CDCDataSpec[AddFile]])
- DeltaLog
- Path
- SnapshotDescriptor
- Row Index Filters
CdcAddFileIndex is created when:
- CDCReaderImpl is requested for the DataFrame with deleted and added rows and to processDeletionVectorActions
Row Index Filters¶
SupportsRowIndexFilters
rowIndexFilters is part of the SupportsRowIndexFilters abstraction.
CdcAddFileIndex is given Row Index Filters when created.
Input Files¶
inputFiles...FIXME
Matching Files¶
TahoeFileIndex
matchingFiles is part of the TahoeFileIndex abstraction.
matchingFiles...FIXME
Partitions¶
FileIndex
partitionSchema is part of the FileIndex (Spark SQL) abstraction.
partitionSchema is the cdcReadSchema for the partitions of (the Metadata of) the given SnapshotDescriptor.
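A minimal sketch of what such a schema might look like, assuming that cdcReadSchema appends the three Change Data Feed metadata columns (`_change_type`, `_commit_version`, `_commit_timestamp`) to the schema it is given. The helper name `withCdcColumns` is hypothetical, and the code needs Spark SQL on the classpath.

```scala
import org.apache.spark.sql.types.{LongType, StringType, StructType, TimestampType}

object CdcPartitionSchemaSketch {
  // Assumption: the CDF metadata columns are appended after the
  // table's own partition columns, in this order.
  def withCdcColumns(partitionSchema: StructType): StructType =
    partitionSchema
      .add("_change_type", StringType)
      .add("_commit_version", LongType)
      .add("_commit_timestamp", TimestampType)

  def main(args: Array[String]): Unit = {
    // A table partitioned by a single string column called "date".
    val tablePartitions = new StructType().add("date", StringType)
    println(withCdcColumns(tablePartitions).fieldNames.mkString(", "))
  }
}
```

Under this assumption, the CDF metadata would surface as extra partition columns, which allows it to be attached on a per-file basis.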