AutoCompactBase¶
AutoCompactBase is an extension of the PostCommitHook abstraction for post-commit hooks that perform auto compaction.
Implementations¶
Name¶
name is Auto Compact.
Executing Post-Commit Hook¶
PostCommitHook
run(
  spark: SparkSession,
  txn: OptimisticTransactionImpl,
  committedVersion: Long,
  postCommitSnapshot: Snapshot,
  actions: Seq[Action]): Unit
run is part of the PostCommitHook abstraction.
run determines whether Auto Compaction is enabled or not.
run returns immediately (and hence skips auto compaction) when shouldSkipAutoCompact is enabled.
In the end, run executes compactIfNecessary with the following:
- delta.commit.hooks.autoOptimize as the operation name
- maxDeletedRowsRatio unspecified (None)
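The flow of run can be sketched as follows. This is a simplified model, not the Delta Lake implementation: RunSketch and CompactCall are hypothetical stand-ins, the outcome of shouldSkipAutoCompact is passed in as a plain Boolean, and compactIfNecessary is represented by a callback that only receives the two arguments named above.

```scala
// A simplified model of run; RunSketch and CompactCall are hypothetical stand-ins.
object RunSketch {
  // The two arguments the text says run passes to compactIfNecessary
  final case class CompactCall(opType: String, maxDeletedRowsRatio: Option[Double])

  // Operation name used for the post-commit hook (from the text above)
  val OpType = "delta.commit.hooks.autoOptimize"

  def run(shouldSkip: Boolean, compactIfNecessary: CompactCall => Unit): Unit =
    // Skip auto compaction entirely when shouldSkipAutoCompact is enabled
    if (!shouldSkip) compactIfNecessary(CompactCall(OpType, maxDeletedRowsRatio = None))
}
```

With shouldSkip set to true the callback is never invoked; otherwise it is invoked exactly once with the operation name and None.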
Compacting If Necessary¶
compactIfNecessary(
  spark: SparkSession,
  txn: OptimisticTransactionImpl,
  postCommitSnapshot: Snapshot,
  opType: String,
  maxDeletedRowsRatio: Option[Double]): Seq[OptimizeMetrics]
maxDeletedRowsRatio is always undefined (None).
compactIfNecessary prepares an AutoCompactRequest to determine whether to perform auto compaction or not (based on the shouldCompact flag of the AutoCompactRequest).
With the shouldCompact flag enabled, compactIfNecessary performs auto compaction. Otherwise, compactIfNecessary returns no OptimizeMetrics.
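The decision described above boils down to a single branch on the shouldCompact flag. The sketch below assumes reduced, hypothetical versions of AutoCompactRequest and OptimizeMetrics (only the fields needed here) and takes the actual compaction as a function argument, so it illustrates the control flow rather than the real classes.

```scala
// Hypothetical, reduced stand-ins for the Delta Lake types named in the text.
object CompactIfNecessarySketch {
  final case class AutoCompactRequest(shouldCompact: Boolean)
  final case class OptimizeMetrics(numFilesAdded: Long, numFilesRemoved: Long)

  def compactIfNecessary(
      request: AutoCompactRequest,
      compact: () => Seq[OptimizeMetrics]): Seq[OptimizeMetrics] =
    if (request.shouldCompact) compact() // perform auto compaction
    else Seq.empty                        // no OptimizeMetrics
}
```

Because compact is only called inside the first branch, a request with shouldCompact disabled never starts any optimization work.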
getAutoCompactType¶
getAutoCompactType(
  conf: SQLConf,
  metadata: Metadata): Option[AutoCompactType]
Return Type
Option[AutoCompactType] is the return type, but it is effectively an "enabled" or "disabled" flag.
When getAutoCompactType returns Some[AutoCompactType], auto compaction is "enabled", while None means "disabled".
getAutoCompactType reports auto compaction as enabled when any of the following is true (in the order of precedence):
- spark.databricks.delta.autoCompact.enabled
- (deprecated) delta.autoOptimize table property
- delta.autoOptimize.autoCompact table property
getAutoCompactType defaults to false (disabled).
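The precedence can be illustrated with plain maps standing in for SQLConf and the table Metadata. The text does not say how an explicit false in a higher-precedence property interacts with the lower ones; this sketch assumes that the first property that is set at all wins, which may differ from the actual implementation. AutoCompactTypeSketch and its Enabled member are hypothetical; only the three property names come from the list above.

```scala
// Illustrative only: Maps stand in for SQLConf and Metadata.
object AutoCompactTypeSketch {
  sealed trait AutoCompactType
  case object Enabled extends AutoCompactType // hypothetical single member

  // Property names from the list above, in the order of precedence
  val SessionKey = "spark.databricks.delta.autoCompact.enabled"
  val DeprecatedTableKey = "delta.autoOptimize"
  val TableKey = "delta.autoOptimize.autoCompact"

  def getAutoCompactType(
      sessionConf: Map[String, String],
      tableProperties: Map[String, String]): Option[AutoCompactType] = {
    // ASSUMPTION: the first property that is set decides the outcome
    val firstSetValue = sessionConf.get(SessionKey)
      .orElse(tableProperties.get(DeprecatedTableKey))
      .orElse(tableProperties.get(TableKey))
    // Nothing set means false (disabled), i.e. None
    val enabled = firstSetValue.exists(_.equalsIgnoreCase("true"))
    if (enabled) Some(Enabled) else None
  }
}
```

Under that assumption, a session-level false overrides a table property set to true, and a table with none of the properties set gets None.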
shouldSkipAutoCompact¶
shouldSkipAutoCompact(
  autoCompactTypeOpt: Option[AutoCompactType],
  spark: SparkSession,
  txn: OptimisticTransactionImpl): Boolean
shouldSkipAutoCompact is enabled (true) for either of the following:
- The given autoCompactTypeOpt is empty (None)
- isQualifiedForAutoCompact is disabled
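The two conditions combine into a single Boolean expression. In this sketch the result of isQualifiedForAutoCompact is supplied as a by-name parameter instead of being computed from a SparkSession and a transaction, and SkipSketch is a hypothetical name; the point is only to show how the conditions compose.

```scala
// A minimal model of the skip decision; not the Delta Lake implementation.
object SkipSketch {
  def shouldSkipAutoCompact[T](
      autoCompactTypeOpt: Option[T],
      isQualifiedForAutoCompact: => Boolean): Boolean =
    // Skip when auto compaction is disabled (None) or the transaction is not qualified.
    // The by-name parameter is not evaluated at all when autoCompactTypeOpt is empty.
    autoCompactTypeOpt.isEmpty || !isQualifiedForAutoCompact
}
```

Auto compaction therefore proceeds only when a type is defined and the qualification check passes.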
Executing Auto Compaction¶
compact(
  spark: SparkSession,
  deltaLog: DeltaLog,
  catalogTable: Option[CatalogTable],
  partitionPredicates: Seq[Expression] = Nil,
  opType: String = OP_TYPE,
  maxDeletedRowsRatio: Option[Double] = None): Seq[OptimizeMetrics]
compact starts a transaction on the delta table and performs optimization.
compact requests the given DeltaLog to start a transaction.
compact creates a DeltaOptimizeContext with the value of the following configuration properties:
compact requests a new OptimizeExecutor (with no zOrderByColumns and the isAutoCompact flag enabled) to optimize.
Note
The delta table to run optimize on is passed indirectly, as the DeltaLog via the OptimisticTransaction.
In the end, compact returns the OptimizeMetrics (from the optimize stats).