DeletionVectorBitmapGenerator¶
buildRowIndexSetsForFilesMatchingCondition¶
buildRowIndexSetsForFilesMatchingCondition(
  sparkSession: SparkSession,
  txn: OptimisticTransaction,
  tableHasDVs: Boolean,
  targetDf: DataFrame,
  candidateFiles: Seq[AddFile],
  condition: Expression,
  fileNameColumnOpt: Option[Column] = None,
  rowIndexColumnOpt: Option[Column] = None): Seq[DeletionVectorResult]
buildRowIndexSetsForFilesMatchingCondition adds the following columns to the input targetDf DataFrame:
Column Name | Column |
---|---|
filePath | The given fileNameColumnOpt if specified or _metadata.file_path |
rowIndexCol | The given rowIndexColumnOpt if specified or a column chosen based on spark.databricks.delta.deletionVectors.useMetadataRowIndex |
deletionVectorId | |
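
The filePath fallback described in the table can be sketched in plain Spark. This is a minimal illustration, not Delta Lake's actual code: the `FilePathColumnSketch` object and its `withFilePath` helper are hypothetical names, and the example assumes a Spark version that exposes the hidden `_metadata` column for file-based sources. Only the filePath column is covered, since the other two columns depend on details not given here.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object FilePathColumnSketch {

  // Hypothetical helper (not part of Delta Lake): adds a filePath column,
  // using the caller-supplied column if given, else _metadata.file_path.
  def withFilePath(
      targetDf: DataFrame,
      fileNameColumnOpt: Option[Column] = None): DataFrame = {
    val fileNameColumn = fileNameColumnOpt.getOrElse(col("_metadata.file_path"))
    targetDf.withColumn("filePath", fileNameColumn)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("file-path-column-sketch")
      .getOrCreate()
    try {
      // Write a small Parquet dataset so that _metadata.file_path is available.
      // The target directory must not exist yet, hence the "data" subdirectory.
      val dir = Files.createTempDirectory("sketch").resolve("data").toString
      spark.range(5).write.parquet(dir)

      // Every row now carries the path of the file it was read from.
      withFilePath(spark.read.parquet(dir)).show(truncate = false)
    } finally {
      spark.stop()
    }
  }
}
```

Passing `Some(column)` as `fileNameColumnOpt` overrides the default, which helps when the input DataFrame is not read directly from files and so has no `_metadata` column.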
In the end, buildRowIndexSetsForFilesMatchingCondition builds the deletion vectors (for the modified targetDf DataFrame).
buildRowIndexSetsForFilesMatchingCondition is used when:
DMLWithDeletionVectorsHelper is requested to findTouchedFiles
Building Deletion Vectors¶
buildDeletionVectors(
  spark: SparkSession,
  target: DataFrame,
  targetDeltaLog: DeltaLog,
  deltaTxn: OptimisticTransaction): Seq[DeletionVectorResult]
buildDeletionVectors creates a new DeletionVectorSet to build the deletion vectors.