DeletionVectorBitmapGenerator
buildRowIndexSetsForFilesMatchingCondition
buildRowIndexSetsForFilesMatchingCondition(
  sparkSession: SparkSession,
  txn: OptimisticTransaction,
  tableHasDVs: Boolean,
  targetDf: DataFrame,
  candidateFiles: Seq[AddFile],
  condition: Expression,
  fileNameColumnOpt: Option[Column] = None,
  rowIndexColumnOpt: Option[Column] = None): Seq[DeletionVectorResult]
buildRowIndexSetsForFilesMatchingCondition adds the following columns to the input targetDf DataFrame:
| Column Name | Column |
|---|---|
| filePath | The given fileNameColumnOpt if specified or _metadata.file_path |
| rowIndexCol | The given rowIndexColumnOpt if specified or a column selected based on spark.databricks.delta.deletionVectors.useMetadataRowIndex |
| deletionVectorId | |
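The fallback used for the filePath column can be sketched as follows. This is a minimal illustration and not Delta Lake's code: FilePathColumnSketch and withFilePath are hypothetical names, and the sketch assumes Spark SQL is on the classpath. The demo passes an explicit column because an in-memory DataFrame has no _metadata column to fall back to.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

object FilePathColumnSketch {
  // Hypothetical helper (not part of Delta Lake): adds a filePath column,
  // using the caller-supplied column when present and Spark's hidden
  // _metadata.file_path column of file-based sources otherwise.
  def withFilePath(
      targetDf: DataFrame,
      fileNameColumnOpt: Option[Column] = None): DataFrame = {
    val fileNameColumn = fileNameColumnOpt.getOrElse(col("_metadata.file_path"))
    targetDf.withColumn("filePath", fileNameColumn)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("file-path-column-sketch")
      .getOrCreate()
    try {
      // An explicit column is supplied, so the _metadata fallback is not used here.
      val df = withFilePath(spark.range(3).toDF("id"), Some(lit("part-0.parquet")))
      df.show(truncate = false)
    } finally {
      spark.stop()
    }
  }
}
```

For a DataFrame read from Parquet files, calling withFilePath(df) without the second argument would rely on the _metadata.file_path fallback instead.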
In the end, buildRowIndexSetsForFilesMatchingCondition builds the deletion vectors for the modified targetDf DataFrame.
buildRowIndexSetsForFilesMatchingCondition is used when:
DMLWithDeletionVectorsHelper is requested to findTouchedFiles
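As the method name suggests, the general idea is to collect, per file, the indexes of the rows that match a condition. The sketch below models that idea with plain Scala collections. It is a conceptual illustration only: TargetRow, RowIndexSet and buildRowIndexSets are hypothetical names, a Set[Long] stands in for whatever bitmap structure the real implementation uses, and none of this is Delta Lake's actual code.

```scala
object RowIndexSetsSketch {
  // Hypothetical model of one target row: the file it was read from,
  // its position within that file, and some payload to filter on.
  final case class TargetRow(filePath: String, rowIndex: Long, value: Int)

  // Hypothetical per-file result: the indexes of the rows that matched.
  final case class RowIndexSet(filePath: String, rowIndexes: Set[Long])

  // Keeps the rows that satisfy the condition, groups them by file path,
  // and collects the matching row indexes of every touched file.
  def buildRowIndexSets(
      rows: Seq[TargetRow],
      condition: TargetRow => Boolean): Seq[RowIndexSet] =
    rows
      .filter(condition)
      .groupBy(_.filePath)
      .map { case (path, matched) => RowIndexSet(path, matched.map(_.rowIndex).toSet) }
      .toSeq
      .sortBy(_.filePath) // deterministic order for display

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      TargetRow("part-0.parquet", 0L, 10),
      TargetRow("part-0.parquet", 1L, 25),
      TargetRow("part-0.parquet", 2L, 30),
      TargetRow("part-1.parquet", 0L, 5),
      TargetRow("part-1.parquet", 1L, 40),
      TargetRow("part-2.parquet", 0L, 1))
    // part-2.parquet has no matching rows, so it does not appear in the output.
    buildRowIndexSets(rows, _.value > 20).foreach(println)
  }
}
```

Files without any matching row produce no entry at all, which mirrors the notion of finding only the touched files.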
Building Deletion Vectors
buildDeletionVectors(
  spark: SparkSession,
  target: DataFrame,
  targetDeltaLog: DeltaLog,
  deltaTxn: OptimisticTransaction): Seq[DeletionVectorResult]
buildDeletionVectors creates a new DeletionVectorSet to build the deletion vectors.