UpdateCommand¶
UpdateCommand is a DeltaCommand that represents a DeltaUpdateTable logical command at execution time.
UpdateCommand is a RunnableCommand (Spark SQL) logical operator.
UpdateCommand can use the Deletion Vectors table feature to soft-delete records when executed (based on shouldWritePersistentDeletionVectors).
Creating Instance¶
UpdateCommand takes the following to be created:
- TahoeFileIndex
- Target Data (LogicalPlan)
- Update Expressions (Spark SQL)
- (optional) Condition Expression (Spark SQL)
UpdateCommand is created when:
- PreprocessTableUpdate logical resolution rule is executed (and resolves a DeltaUpdateTable logical command)
Performance Metrics¶
| Name | web UI |
|---|---|
| numAddedFiles | number of files added |
| numRemovedFiles | number of files removed |
| numUpdatedRows | number of rows updated |
| executionTimeMs | time taken to execute the entire operation |
| scanTimeMs | time taken to scan the files for matches |
| rewriteTimeMs | time taken to rewrite the matched files |
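The three timing metrics can be pictured as measurements taken around two phases (scan, then rewrite). The following is only a minimal sketch of that idea, not the actual UpdateCommand code: `UpdateTimingSketch`, `UpdateTimings`, `measure` and the injectable `clock` are hypothetical names, and the assumption that rewrite time is derived as elapsed time minus scan time is made purely for illustration.

```scala
// A minimal sketch (hypothetical names, not Delta Lake's implementation).
object UpdateTimingSketch {

  // Mirrors the three timing metrics listed in the table above.
  final case class UpdateTimings(
      scanTimeMs: Long,
      rewriteTimeMs: Long,
      executionTimeMs: Long)

  private def nanosToMs(nanos: Long): Long = nanos / 1000 / 1000

  // Runs the two phases in order and measures each against one start time.
  // `clock` is injectable so the sketch can be exercised deterministically.
  def measure(
      scan: () => Unit,
      rewrite: () => Unit,
      clock: () => Long = () => System.nanoTime()): UpdateTimings = {
    val start = clock()

    scan()
    val scanTimeMs = nanosToMs(clock() - start)

    rewrite()
    // Assumption: rewrite time is whatever elapsed after the scan finished.
    val rewriteTimeMs = nanosToMs(clock() - start) - scanTimeMs

    val executionTimeMs = nanosToMs(clock() - start)
    UpdateTimings(scanTimeMs, rewriteTimeMs, executionTimeMs)
  }
}
```

Under this sketch, executionTimeMs is at least scanTimeMs + rewriteTimeMs, since it is read last against the same start time.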
Executing Command¶
RunnableCommand
run(
sparkSession: SparkSession): Seq[Row]
run is part of the RunnableCommand (Spark SQL) abstraction.
run...FIXME
performUpdate¶
performUpdate(
sparkSession: SparkSession,
deltaLog: DeltaLog,
txn: OptimisticTransaction): Unit
performUpdate...FIXME
With persistent Deletion Vectors enabled, performUpdate...FIXME and findTouchedFiles.
rewriteFiles¶
rewriteFiles(
spark: SparkSession,
txn: OptimisticTransaction,
rootPath: Path,
inputLeafFiles: Seq[String],
nameToAddFileMap: Map[String, AddFile],
condition: Expression): Seq[FileAction]
rewriteFiles...FIXME
buildUpdatedColumns¶
buildUpdatedColumns(
condition: Expression): Seq[Column]
buildUpdatedColumns...FIXME
shouldWritePersistentDeletionVectors¶
shouldWritePersistentDeletionVectors(
spark: SparkSession,
txn: OptimisticTransaction): Boolean
shouldWritePersistentDeletionVectors is enabled (true) when the following all hold:
- spark.databricks.delta.update.deletionVectors.persistent configuration property is enabled (true)
- Protocol and table configuration support deletion vectors feature
shouldWritePersistentDeletionVectors is used when:
- UpdateCommand is executed (and performUpdate)
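The conditions for shouldWritePersistentDeletionVectors can be modelled as a plain conjunction. The sketch below is illustrative only: `SessionConf` and `TableState` are hypothetical stand-ins for the Spark session configuration and the transaction's protocol and metadata, and treating an unset property as disabled is an assumption of the sketch, not a statement about the property's real default. Only the configuration key comes from the description above.

```scala
// A minimal sketch (hypothetical types, not Delta Lake's implementation).
object DeletionVectorsSketch {

  // Configuration property named in the description above.
  val PersistentDVsKey: String =
    "spark.databricks.delta.update.deletionVectors.persistent"

  // Hypothetical stand-in for the session configuration.
  final case class SessionConf(settings: Map[String, String]) {
    // Assumption: a missing key counts as disabled in this sketch.
    def isEnabled(key: String): Boolean =
      settings.get(key).exists(_.toBoolean)
  }

  // Hypothetical stand-in for what the transaction knows about the table.
  final case class TableState(
      protocolSupportsDVs: Boolean,
      tableConfigEnablesDVs: Boolean)

  // True only when every condition holds.
  def shouldWritePersistentDeletionVectors(
      conf: SessionConf,
      table: TableState): Boolean =
    conf.isEnabled(PersistentDVsKey) &&
      table.protocolSupportsDVs &&
      table.tableConfigEnablesDVs
}
```

With this shape, turning the configuration property on has no effect for a table whose protocol or table configuration does not support deletion vectors, which matches the "all hold" wording above.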