AutoCompactUtils¶
prepareAutoCompactRequest¶
prepareAutoCompactRequest(
spark: SparkSession,
txn: OptimisticTransactionImpl,
postCommitSnapshot: Snapshot,
partitionsAddedToOpt: Option[PartitionKeySet],
opType: String,
maxDeletedRowsRatio: Option[Double]): AutoCompactRequest
prepareAutoCompactRequest creates an AutoCompactRequest based on reserveTablePartitions and a partition predicate (for the given postCommitSnapshot and the reserved partitions).
prepareAutoCompactRequest is used when:
- AutoCompactBase is requested to compactIfNecessary
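The composition described above can be sketched in plain Scala. This is only an illustration under stated assumptions: RequestSketch and its field names are hypothetical stand-ins for AutoCompactRequest (whose fields are not described on this page), the partition predicate is modelled as Seq[String] rather than Seq[Expression], and the two collaborators are passed in as functions so the sketch does not need Spark or Delta Lake on the classpath.

```scala
object PrepareAutoCompactRequestSketch {
  // Hypothetical stand-ins for Delta Lake's PartitionKey and PartitionKeySet.
  type PartitionKey = Map[String, String]
  type PartitionKeySet = Set[PartitionKey]

  // Hypothetical stand-in for AutoCompactRequest; field names are assumptions.
  final case class RequestSketch(
      shouldCompact: Boolean,
      reservedPartitions: PartitionKeySet,
      partitionPredicate: Seq[String])

  // Reserve partitions first, then build the predicate for the reserved ones.
  def prepare(
      reserveTablePartitions: () => (Boolean, PartitionKeySet),
      createPartitionPredicate: PartitionKeySet => Seq[String]): RequestSketch = {
    val (shouldCompact, reserved) = reserveTablePartitions()
    RequestSketch(shouldCompact, reserved, createPartitionPredicate(reserved))
  }
}
```

The point of the sketch is the ordering: the predicate is derived from the partitions that were actually reserved, not from all partitions the transaction added files to.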
createPartitionPredicate¶
createPartitionPredicate(
postCommitSnapshot: Snapshot,
partitions: PartitionKeySet): Seq[Expression]
createPartitionPredicate ...FIXME
reserveTablePartitions¶
reserveTablePartitions(
spark: SparkSession,
deltaLog: DeltaLog,
postCommitSnapshot: Snapshot,
partitionsAddedToOpt: Option[PartitionKeySet],
opType: String,
maxDeletedRowsRatio: Option[Double]): (Boolean, PartitionKeySet)
maxDeletedRowsRatio is always undefined (None) as that's what prepareAutoCompactRequest is called with when compacting if necessary.
opType is always delta.commit.hooks.autoOptimize.
partitionsAddedToOpt is the set of distinct partitions that contain files added by the current transaction.
reserveTablePartitions does nothing and exits early (noop) when the given partitionsAddedToOpt is empty. In that case, reserveTablePartitions returns (false, Set.empty[PartitionKey]).
reserveTablePartitions finds free partitions to perform auto compaction on based on two internal flags. When both are enabled, reserveTablePartitions calls filterFreePartitions. Otherwise, the given partitionsAddedToOpt is used as-is.
reserveTablePartitions does nothing (noop) when there is no free partition. In that case, reserveTablePartitions returns (false, Set.empty[PartitionKey]).
reserveTablePartitions calls choosePartitionsBasedOnMinNumSmallFiles with the free partitions.
With shouldCompactBasedOnNumFiles enabled and no chosenPartitionsBasedOnNumFiles, reserveTablePartitions does nothing more and returns (true, Set.empty[PartitionKey]).
reserveTablePartitions calls choosePartitionsBasedOnDVs with the free partitions.
reserveTablePartitions ...FIXME
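The documented early exits of reserveTablePartitions can be modelled in plain Scala. This is a sketch under assumptions, not the Delta Lake implementation: the type aliases, the Outcome hierarchy, and the documentedSteps name are hypothetical, the two internal flags are collapsed into a single bothFlagsEnabled parameter, and filterFreePartitions and the small-files selection are passed in as functions. The steps from choosePartitionsBasedOnDVs onwards are not modelled and are represented by Continue.

```scala
object ReserveTablePartitionsSketch {
  // Hypothetical stand-ins for Delta Lake's PartitionKey and PartitionKeySet.
  type PartitionKey = Map[String, String]
  type PartitionKeySet = Set[PartitionKey]

  sealed trait Outcome
  // An early return carrying the documented (Boolean, PartitionKeySet) pair.
  final case class Returned(shouldCompact: Boolean, partitions: PartitionKeySet) extends Outcome
  // No early return; the steps not covered by this sketch would run next.
  final case class Continue(freePartitions: PartitionKeySet) extends Outcome

  def documentedSteps(
      partitionsAddedToOpt: Option[PartitionKeySet],
      bothFlagsEnabled: Boolean,
      filterFreePartitions: PartitionKeySet => PartitionKeySet,
      chooseBasedOnNumFiles: PartitionKeySet => (Boolean, PartitionKeySet)): Outcome = {
    partitionsAddedToOpt match {
      // Step 1: nothing to do when no partitions were given.
      case None => Returned(false, Set.empty)
      case Some(partitionsAddedTo) =>
        // Step 2: filter free partitions only when both flags are enabled.
        val free =
          if (bothFlagsEnabled) filterFreePartitions(partitionsAddedTo)
          else partitionsAddedTo
        // Step 3: nothing to do when there is no free partition.
        if (free.isEmpty) Returned(false, Set.empty)
        else {
          // Step 4: small-files selection may short-circuit with (true, empty).
          val (shouldCompactBasedOnNumFiles, chosen) = chooseBasedOnNumFiles(free)
          if (shouldCompactBasedOnNumFiles && chosen.isEmpty) Returned(true, Set.empty)
          else Continue(free)
        }
    }
  }
}
```

Note the two different "empty" results: (false, empty) means there is nothing to compact, while (true, empty) comes from the small-files check being enabled with no chosen partitions.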
choosePartitionsBasedOnMinNumSmallFiles¶
choosePartitionsBasedOnMinNumSmallFiles(
spark: SparkSession,
deltaLog: DeltaLog,
postCommitSnapshot: Snapshot,
freePartitionsAddedTo: PartitionKeySet): ChosenPartitionsResult
choosePartitionsBasedOnMinNumSmallFiles ...FIXME
isQualifiedForAutoCompact¶
isQualifiedForAutoCompact(
spark: SparkSession,
txn: OptimisticTransactionImpl): Boolean
isQualifiedForAutoCompact is disabled (false) when there is no transaction commit (i.e., no txnExecutionTimeMs in the given OptimisticTransactionImpl).
isQualifiedForAutoCompact is enabled (true) when isModifiedPartitionsOnlyAutoCompactEnabled is disabled.
Otherwise, isQualifiedForAutoCompact is enabled (true) if either of the following holds:
- isNonBlindAppendAutoCompactEnabled is disabled
- The given OptimisticTransactionImpl is not blind-append
isQualifiedForAutoCompact is used when:
- AutoCompactBase is requested to shouldSkipAutoCompact
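The decision order above can be written as a small pure function. This is a minimal sketch, assuming the three checks are evaluated in the order they are listed: TxnInfo is a hypothetical stand-in for the two properties of OptimisticTransactionImpl that matter here, and the two configuration flags are passed in as plain Booleans instead of being read from a SparkSession.

```scala
object AutoCompactQualificationSketch {
  // Hypothetical stand-in for the relevant parts of OptimisticTransactionImpl.
  final case class TxnInfo(txnExecutionTimeMs: Option[Long], isBlindAppend: Boolean)

  def isQualified(
      txn: TxnInfo,
      modifiedPartitionsOnlyEnabled: Boolean,
      nonBlindAppendEnabled: Boolean): Boolean = {
    // No execution time recorded means there was no transaction commit.
    if (txn.txnExecutionTimeMs.isEmpty) false
    // With modified-partitions-only mode off, no further checks are needed.
    else if (!modifiedPartitionsOnlyEnabled) true
    // Otherwise qualify unless the flag is on and the commit is a blind append.
    else !nonBlindAppendEnabled || !txn.isBlindAppend
  }
}
```

For example, a committed blind append with both flags enabled is the only combination that passes the first two checks and is still rejected.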
isNonBlindAppendAutoCompactEnabled¶
isNonBlindAppendAutoCompactEnabled(
spark: SparkSession): Boolean
isNonBlindAppendAutoCompactEnabled is the value of spark.databricks.delta.autoCompact.nonBlindAppend.enabled configuration property (in the given SparkSession).
isModifiedPartitionsOnlyAutoCompactEnabled¶
isModifiedPartitionsOnlyAutoCompactEnabled(
spark: SparkSession): Boolean
isModifiedPartitionsOnlyAutoCompactEnabled says whether Auto Compaction should run on modified partitions only.
isModifiedPartitionsOnlyAutoCompactEnabled is the value of spark.databricks.delta.autoCompact.modifiedPartitionsOnly.enabled configuration property (in the given SparkSession).
isModifiedPartitionsOnlyAutoCompactEnabled is used when:
- AutoCompactUtils is requested to choosePartitionsBasedOnMinNumSmallFiles, isQualifiedForAutoCompact, reserveTablePartitions