# DeleteCommand

`DeleteCommand` is a DeltaCommand that represents a DeltaDelete logical command at execution.

`DeleteCommand` is a `LeafRunnableCommand` (Spark SQL) logical operator.
## Creating Instance

`DeleteCommand` takes the following to be created:

- TahoeFileIndex
- Target Data (`LogicalPlan`)
- Condition (`Expression`)

`DeleteCommand` is created (also using the `apply` factory utility) when:

- `PreprocessTableDelete` logical resolution rule is executed (and resolves a `DeltaDelete` logical command)
## Performance Metrics

```scala
metrics: Map[String, SQLMetric]
```

`metrics` is part of the `RunnableCommand` (Spark SQL) abstraction.

`metrics` creates the performance metrics.
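As a rough illustration, the metrics can be thought of as a map from metric names to `SQLMetric`s. The sketch below uses a stub type in place of Spark's `SQLMetric`, and the metric names shown are illustrative examples only, not the full set `DeleteCommand` registers:

```scala
// Minimal sketch only: SQLMetricStub stands in for Spark's SQLMetric,
// and the metric names are illustrative examples, not the full set.
case class SQLMetricStub(description: String, var value: Long = 0L)

def createMetrics(): Map[String, SQLMetricStub] = Map(
  "numRemovedFiles" -> SQLMetricStub("number of files removed."),
  "numAddedFiles"   -> SQLMetricStub("number of files added."),
  "numDeletedRows"  -> SQLMetricStub("number of rows deleted.")
)
```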
## Executing Command

```scala
run(
  sparkSession: SparkSession): Seq[Row]
```

`run` is part of the `RunnableCommand` (Spark SQL) abstraction.

`run` requests the TahoeFileIndex for the DeltaLog (and asserts that the table is removable).

`run` requests the DeltaLog to start a new transaction for performDelete.

In the end, `run` re-caches all cached plans (incl. this relation itself) by requesting the `CacheManager` (Spark SQL) to recache the target.
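The flow above can be sketched with stub types. Everything here (`TahoeFileIndexStub`, `DeltaLogStub`, `withNewTransaction` as a higher-order helper) is an illustrative stand-in for the actual Spark and Delta Lake classes, not their real APIs:

```scala
// Hypothetical sketch of the run flow; stub types replace the real classes.
case class OptimisticTransactionStub(id: Int)

class DeltaLogStub {
  var started = 0
  def assertRemovable(): Unit = ()  // would fail if deletes were disallowed
  def withNewTransaction(body: OptimisticTransactionStub => Unit): Unit = {
    started += 1
    body(OptimisticTransactionStub(started))  // performDelete runs inside the txn
  }
}

class TahoeFileIndexStub(val deltaLog: DeltaLogStub)

def run(index: TahoeFileIndexStub, performDelete: OptimisticTransactionStub => Unit): Unit = {
  val log = index.deltaLog                // 1. get the DeltaLog from the file index
  log.assertRemovable()                   // 2. assert the table is removable
  log.withNewTransaction(performDelete)   // 3. start a transaction and performDelete
  // 4. re-caching of cached plans is omitted in this sketch
}
```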
## performDelete

```scala
performDelete(
  sparkSession: SparkSession,
  deltaLog: DeltaLog,
  txn: OptimisticTransaction): Unit
```

`performDelete` is used when:

- `DeleteCommand` is executed
- `WriteIntoDelta` is requested to removeFiles
### Number of Table Files

`performDelete` requests the given DeltaLog for the current Snapshot, which is in turn requested for the number of files in the delta table.
### Finding Delete Actions

`performDelete` branches off based on the optional condition:

- No condition, to delete the whole table
- A condition defined on metadata only
- Other conditions
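The three-way branch can be sketched as follows. The strategy names are illustrative only; the actual code inlines this decision inside `performDelete` rather than naming the cases:

```scala
// Hypothetical sketch of the branching on the optional delete condition.
sealed trait DeleteStrategy
case object DeleteWholeTable   extends DeleteStrategy
case object MetadataOnlyDelete extends DeleteStrategy
case object RewriteFilesDelete extends DeleteStrategy

def chooseStrategy(condition: Option[String], isMetadataOnly: Boolean): DeleteStrategy =
  condition match {
    case None                      => DeleteWholeTable   // no condition: delete everything
    case Some(_) if isMetadataOnly => MetadataOnlyDelete // condition touches metadata only
    case Some(_)                   => RewriteFilesDelete // general case: rewrite touched files
  }
```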
#### Delete Condition Undefined

`performDelete`...FIXME

#### Metadata-Only Delete Condition

`performDelete`...FIXME

#### Other Delete Conditions

`performDelete`...FIXME

### Delete Actions Available

`performDelete`...FIXME
## rewriteFiles

```scala
rewriteFiles(
  txn: OptimisticTransaction,
  baseData: DataFrame,
  filterCondition: Expression,
  numFilesToRewrite: Long): Seq[FileAction]
```

`rewriteFiles` reads the delta.enableChangeDataFeed table property of the delta table (from the Metadata of the given OptimisticTransaction).

`rewriteFiles` creates a numTouchedRows metric and a numTouchedRowsUdf UDF to count the number of rows that have been touched.

`rewriteFiles` creates a `DataFrame` to write (with the numTouchedRowsUdf UDF and the filterCondition column). The `DataFrame` can also include a _change_type column (with `null` or `delete` values based on the filterCondition).

In the end, `rewriteFiles` requests the given OptimisticTransaction to write the DataFrame.
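The per-row _change_type value can be sketched as below. This is a simplified model (an `Option` stands in for the nullable column, and the function name is made up); the real code builds the column as a Catalyst expression over the `DataFrame`:

```scala
// Hypothetical sketch: with Change Data Feed enabled, rows matching the
// delete condition get _change_type = "delete"; surviving rows get null
// (modeled here as None).
def changeTypeFor(matchesDeleteCondition: Boolean): Option[String] =
  if (matchesDeleteCondition) Some("delete") else None
```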
## shouldWritePersistentDeletionVectors

```scala
shouldWritePersistentDeletionVectors(
  spark: SparkSession,
  txn: OptimisticTransaction): Boolean
```

`shouldWritePersistentDeletionVectors` is enabled (`true`) when all of the following hold:

- spark.databricks.delta.delete.deletionVectors.persistent configuration property is enabled (`true`)
- The Protocol and table configuration support the deletion vectors feature
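The check is a simple conjunction of the two conditions above. In this sketch the parameters are stand-ins for the configuration lookup and the protocol/metadata capability check that the real method performs:

```scala
// Minimal sketch: both the configuration property and the table's
// protocol/configuration support must hold for the result to be true.
def shouldWritePersistentDeletionVectors(
    persistentDVConfEnabled: Boolean,
    tableSupportsDeletionVectors: Boolean): Boolean =
  persistentDVConfEnabled && tableSupportsDeletionVectors
```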
## Creating DeleteCommand

```scala
apply(
  delete: DeltaDelete): DeleteCommand
```

`apply` creates a DeleteCommand.
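A rough sketch of the factory is below. The stub case classes replace the real Spark and Delta types, and the actual `apply` additionally resolves the TahoeFileIndex from the target plan, which this sketch omits:

```scala
// Hypothetical sketch of the apply factory: copy the target and condition
// from a DeltaDelete-like node into a DeleteCommand-like node.
case class DeltaDeleteStub(target: String, condition: Option[String])
case class DeleteCommandStub(target: String, condition: Option[String])

def apply(delete: DeltaDeleteStub): DeleteCommandStub =
  DeleteCommandStub(delete.target, delete.condition)
```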