AutoCompactBase¶
AutoCompactBase is an extension of the PostCommitHook abstraction for post-commit hooks that perform auto compaction.
Implementations¶
Name¶
name is Auto Compact.
Executing Post-Commit Hook¶
PostCommitHook
run(
  spark: SparkSession,
  txn: OptimisticTransactionImpl,
  committedVersion: Long,
  postCommitSnapshot: Snapshot,
  actions: Seq[Action]): Unit
run is part of the PostCommitHook abstraction.
run determines whether Auto Compaction is enabled or not.
run returns immediately (and hence skips auto compaction) when shouldSkipAutoCompact is true.
In the end, run calls compactIfNecessary with the following:
- delta.commit.hooks.autoOptimize as the operation name
- maxDeletedRowsRatio unspecified (None)
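The control flow of run can be sketched with stand-in types. This is a simplified model under stated assumptions, not the Delta Lake source: AutoCompactType here is a hypothetical placeholder, the object name RunFlowSketch is made up for illustration, and the three steps are passed in as plain functions so that the sketch runs without Spark.

```scala
object RunFlowSketch {
  // Hypothetical stand-in for Delta Lake's AutoCompactType.
  sealed trait AutoCompactType
  case object Enabled extends AutoCompactType

  // Arguments that run passes to compactIfNecessary, as listed above.
  val OpType = "delta.commit.hooks.autoOptimize"
  val MaxDeletedRowsRatio: Option[Double] = None

  /** Returns true when compaction was attempted and false when it was skipped. */
  def run(
      getAutoCompactType: () => Option[AutoCompactType],
      shouldSkipAutoCompact: Option[AutoCompactType] => Boolean,
      compactIfNecessary: (String, Option[Double]) => Unit): Boolean = {
    // Step 1: determine whether auto compaction is enabled.
    val autoCompactTypeOpt = getAutoCompactType()
    if (shouldSkipAutoCompact(autoCompactTypeOpt)) {
      // Step 2: nothing to do, skip auto compaction.
      false
    } else {
      // Step 3: hand over to compactIfNecessary.
      compactIfNecessary(OpType, MaxDeletedRowsRatio)
      true
    }
  }
}
```

The Boolean result exists only to make the sketch observable; the real run returns Unit.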
Compacting If Necessary¶
compactIfNecessary(
  spark: SparkSession,
  txn: OptimisticTransactionImpl,
  postCommitSnapshot: Snapshot,
  opType: String,
  maxDeletedRowsRatio: Option[Double]): Seq[OptimizeMetrics]
maxDeletedRowsRatio is always undefined (None).
compactIfNecessary prepares an AutoCompactRequest to determine whether to perform auto compaction or not (based on shouldCompact flag of the AutoCompactRequest).
With shouldCompact flag enabled, compactIfNecessary performs auto compaction. Otherwise, compactIfNecessary returns no OptimizeMetrics.
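The decision made by compactIfNecessary can be modelled in a few lines. The sketch below assumes trimmed-down, hypothetical versions of AutoCompactRequest and OptimizeMetrics (the real classes carry more fields) and takes the request preparation and the compaction itself as function arguments, so it is an illustration of the branching only.

```scala
object CompactIfNecessarySketch {
  // Hypothetical stand-ins; only the parts needed for the branching are kept.
  final case class AutoCompactRequest(shouldCompact: Boolean)
  final case class OptimizeMetrics(numFilesAdded: Long, numFilesRemoved: Long)

  def compactIfNecessary(
      prepareRequest: () => AutoCompactRequest,
      compact: () => Seq[OptimizeMetrics]): Seq[OptimizeMetrics] = {
    val request = prepareRequest()
    // Compact only when the request says so; otherwise report no metrics.
    if (request.shouldCompact) compact() else Seq.empty
  }
}
```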
getAutoCompactType¶
getAutoCompactType(
  conf: SQLConf,
  metadata: Metadata): Option[AutoCompactType]
Return Type
Option[AutoCompactType] is the return type but it's a fancy way to say "enabled" or "not".
When getAutoCompactType returns Some[AutoCompactType] it means "enabled" while None is "disabled".
Auto compaction is enabled (getAutoCompactType returns Some[AutoCompactType]) when any of the following is true (in the order of precedence):
- spark.databricks.delta.autoCompact.enabled
- (deprecated) delta.autoOptimize table property
- delta.autoOptimize.autoCompact table property
Auto compaction is disabled by default (getAutoCompactType returns None when none of the above is set).
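One possible reading of the precedence above is sketched below with plain maps in place of SQLConf and Metadata. It assumes that an explicitly set session configuration decides on its own and that the two table properties are consulted, in the listed order, only when it is unset; that reading is an assumption of this sketch. It also treats every setting as a plain "true"/"false" string, which is a simplification. The object and method names are hypothetical.

```scala
object AutoCompactPrecedenceSketch {
  val SessionConfKey = "spark.databricks.delta.autoCompact.enabled"
  val DeprecatedTableProp = "delta.autoOptimize"
  val TableProp = "delta.autoOptimize.autoCompact"

  // Reads a setting as a Boolean, or None when the key is absent.
  private def flag(settings: Map[String, String], key: String): Option[Boolean] =
    settings.get(key).map(_.trim.equalsIgnoreCase("true"))

  def isAutoCompactEnabled(
      sessionConf: Map[String, String],
      tableProps: Map[String, String]): Boolean =
    // Assumption: a session-level value, when present, wins outright.
    flag(sessionConf, SessionConfKey).getOrElse {
      // Otherwise the deprecated property, then the current one.
      flag(tableProps, DeprecatedTableProp).contains(true) ||
        flag(tableProps, TableProp).contains(true)
    }
}
```

With nothing set, the sketch returns false, which matches the disabled default.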
shouldSkipAutoCompact¶
shouldSkipAutoCompact(
  autoCompactTypeOpt: Option[AutoCompactType],
  spark: SparkSession,
  txn: OptimisticTransactionImpl): Boolean
shouldSkipAutoCompact is enabled (true) for the following:
- The given autoCompactTypeOpt is empty (None)
- isQualifiedForAutoCompact is disabled
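The two conditions combine into a single predicate. The sketch below is a minimal model with a hypothetical signature: the qualification check is passed in as a by-name Boolean instead of being computed from the SparkSession and the transaction.

```scala
object ShouldSkipSketch {
  def shouldSkipAutoCompact[T](
      autoCompactTypeOpt: Option[T],
      isQualifiedForAutoCompact: => Boolean): Boolean =
    // Skip when auto compaction is disabled or the transaction does not qualify.
    autoCompactTypeOpt.isEmpty || !isQualifiedForAutoCompact
}
```

The by-name parameter is a design choice of the sketch: the qualification check is not evaluated at all when auto compaction is disabled.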
Executing Auto Compaction¶
compact(
  spark: SparkSession,
  deltaLog: DeltaLog,
  catalogTable: Option[CatalogTable],
  partitionPredicates: Seq[Expression] = Nil,
  opType: String = OP_TYPE,
  maxDeletedRowsRatio: Option[Double] = None): Seq[OptimizeMetrics]
compact starts a transaction on the delta table and performs optimization.
compact requests the given DeltaLog to start a transaction.
compact creates a DeltaOptimizeContext with the value of the following configuration properties:
compact requests a new OptimizeExecutor (with no zOrderByColumns and the isAutoCompact flag enabled) to optimize.
Note
The delta table to run optimize on is passed indirectly, as the DeltaLog via the OptimisticTransaction.
In the end, compact returns the OptimizeMetrics (from the optimize stats).