AutoCompactUtils¶
prepareAutoCompactRequest¶
prepareAutoCompactRequest(
spark: SparkSession,
txn: OptimisticTransactionImpl,
postCommitSnapshot: Snapshot,
partitionsAddedToOpt: Option[PartitionKeySet],
opType: String,
maxDeletedRowsRatio: Option[Double]): AutoCompactRequest
prepareAutoCompactRequest creates an AutoCompactRequest based on reserveTablePartitions and a partition predicate (for the given postCommitSnapshot and the reserved partitions).
prepareAutoCompactRequest is used when:
AutoCompactBase is requested to compactIfNecessary
createPartitionPredicate¶
createPartitionPredicate(
postCommitSnapshot: Snapshot,
partitions: PartitionKeySet): Seq[Expression]
createPartitionPredicate...FIXME
reserveTablePartitions¶
reserveTablePartitions(
spark: SparkSession,
deltaLog: DeltaLog,
postCommitSnapshot: Snapshot,
partitionsAddedToOpt: Option[PartitionKeySet],
opType: String,
maxDeletedRowsRatio: Option[Double]): (Boolean, PartitionKeySet)
maxDeletedRowsRatio always undefined (None)
maxDeletedRowsRatio is always None as that's what prepareAutoCompactRequest is called with when compacting if necessary.
opType always delta.commit.hooks.autoOptimize
opType is always delta.commit.hooks.autoOptimize.
partitionsAddedToOpt
partitionsAddedToOpt is the set of distinct partitions that contain added files by the current transaction.
Noop when the given partitionsAddedToOpt is empty
reserveTablePartitions does nothing and exits early (noop) when the given partitionsAddedToOpt is empty.
reserveTablePartitions returns (false, Set.empty[PartitionKey]).
reserveTablePartitions finds free partitions to perform auto compaction on based on the two internal flags:
When both flags are enabled, reserveTablePartitions filters the given partitionsAddedToOpt down to the free partitions (filterFreePartitions). Otherwise, the given partitionsAddedToOpt is used as-is.
reserveTablePartitions does nothing (noop) when there is no free partition. reserveTablePartitions returns (false, Set.empty[PartitionKey]).
reserveTablePartitions choosePartitionsBasedOnMinNumSmallFiles with the free partitions.
With shouldCompactBasedOnNumFiles enabled and no chosenPartitionsBasedOnNumFiles, reserveTablePartitions does nothing more and returns (true, Set.empty[PartitionKey]).
reserveTablePartitions choosePartitionsBasedOnDVs with the free partitions.
reserveTablePartitions...FIXME
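The documented early exits of reserveTablePartitions can be sketched as follows. This is a minimal model under stated assumptions, not the Delta Lake implementation: `ReserveSketch`, `earlyExit`, and all of its parameters are hypothetical names, `PartitionKey` is assumed to be a map of partition column names to values, and an undefined (None) partitionsAddedToOpt is treated as "empty". The sketch returns None wherever processing would continue into the steps that are not described here.

```scala
object ReserveSketch {
  // Assumption: a partition key maps partition column names to values.
  type PartitionKey = Map[String, String]
  type PartitionKeySet = Set[PartitionKey]

  /** Returns Some(result) for a documented early exit, None when processing continues. */
  def earlyExit(
      partitionsAddedToOpt: Option[PartitionKeySet],
      bothFlagsEnabled: Boolean, // hypothetical: stands for the two internal flags
      filterFreePartitions: PartitionKeySet => PartitionKeySet,
      shouldCompactBasedOnNumFiles: Boolean,
      chooseBasedOnNumFiles: PartitionKeySet => PartitionKeySet
  ): Option[(Boolean, PartitionKeySet)] = {
    val nothing = Set.empty[PartitionKey]
    partitionsAddedToOpt match {
      // Noop: no partitions were given.
      case None => Some((false, nothing))
      case Some(added) =>
        // Filter only when both flags are enabled; otherwise use the input as-is.
        val free = if (bothFlagsEnabled) filterFreePartitions(added) else added
        if (free.isEmpty) {
          // Noop: there is no free partition.
          Some((false, nothing))
        } else {
          val chosen = chooseBasedOnNumFiles(free)
          if (shouldCompactBasedOnNumFiles && chosen.isEmpty) Some((true, nothing))
          else None // the remaining steps are outside this sketch
        }
    }
  }
}
```

Modelling the result as an Option keeps the sketch honest about where the documented behaviour ends: only the three early exits produce a value.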
choosePartitionsBasedOnMinNumSmallFiles¶
choosePartitionsBasedOnMinNumSmallFiles(
spark: SparkSession,
deltaLog: DeltaLog,
postCommitSnapshot: Snapshot,
freePartitionsAddedTo: PartitionKeySet): ChosenPartitionsResult
choosePartitionsBasedOnMinNumSmallFiles...FIXME
isQualifiedForAutoCompact¶
isQualifiedForAutoCompact(
spark: SparkSession,
txn: OptimisticTransactionImpl): Boolean
isQualifiedForAutoCompact is disabled (false) when there is no transaction commit (i.e., no txnExecutionTimeMs in the given OptimisticTransactionImpl).
isQualifiedForAutoCompact is enabled (true) when isModifiedPartitionsOnlyAutoCompactEnabled is disabled.
isQualifiedForAutoCompact is enabled (true) if either holds:
- isNonBlindAppendAutoCompactEnabled is disabled
- The given OptimisticTransactionImpl is not blind-append
isQualifiedForAutoCompact is used when:
AutoCompactBase is requested to shouldSkipAutoCompact
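The decision rules of isQualifiedForAutoCompact can be read as a short chain of checks. The following is a minimal sketch of that chain, not the actual Delta Lake code: `TxnInfo` and `AutoCompactFlags` are hypothetical stand-ins for the given OptimisticTransactionImpl and for the two configuration properties, and the rules are assumed to apply in the order they are listed.

```scala
object IsQualifiedSketch {
  // Hypothetical stand-in for the parts of OptimisticTransactionImpl used here.
  final case class TxnInfo(txnExecutionTimeMs: Option[Long], isBlindAppend: Boolean)

  // Hypothetical holder for the two configuration flags.
  final case class AutoCompactFlags(modifiedPartitionsOnly: Boolean, nonBlindAppend: Boolean)

  def isQualified(flags: AutoCompactFlags, txn: TxnInfo): Boolean =
    if (txn.txnExecutionTimeMs.isEmpty) false      // no transaction commit
    else if (!flags.modifiedPartitionsOnly) true   // no further checks needed
    else !flags.nonBlindAppend || !txn.isBlindAppend
}
```

Under these assumptions, the only committed transaction that is rejected is a blind append while both flags are enabled.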
isNonBlindAppendAutoCompactEnabled¶
isNonBlindAppendAutoCompactEnabled(
spark: SparkSession): Boolean
isNonBlindAppendAutoCompactEnabled is the value of spark.databricks.delta.autoCompact.nonBlindAppend.enabled configuration property (in the given SparkSession).
isModifiedPartitionsOnlyAutoCompactEnabled¶
isModifiedPartitionsOnlyAutoCompactEnabled(
spark: SparkSession): Boolean
isModifiedPartitionsOnlyAutoCompactEnabled says whether Auto Compaction should run on modified partitions only.
isModifiedPartitionsOnlyAutoCompactEnabled is the value of spark.databricks.delta.autoCompact.modifiedPartitionsOnly.enabled configuration property (in the given SparkSession).
isModifiedPartitionsOnlyAutoCompactEnabled is used when:
AutoCompactUtils is requested to choosePartitionsBasedOnMinNumSmallFiles, isQualifiedForAutoCompact, reserveTablePartitions
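Both flags are read from the given SparkSession, so they can be set per session. The snippet below is a configuration sketch that assumes Spark is on the classpath and that the properties can be changed at runtime; the property names are the ones listed above, the values are chosen only for illustration, and `AutoCompactConfSketch` and the application name are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object AutoCompactConfSketch extends App {
  val spark = SparkSession.builder()
    .master("local[*]")
    .appName("auto-compact-conf-sketch")
    .getOrCreate()

  // Illustrative values only; they are not claimed to be the defaults.
  spark.conf.set("spark.databricks.delta.autoCompact.modifiedPartitionsOnly.enabled", "true")
  spark.conf.set("spark.databricks.delta.autoCompact.nonBlindAppend.enabled", "false")

  // Read a value back to confirm what the session currently holds.
  println(spark.conf.get("spark.databricks.delta.autoCompact.modifiedPartitionsOnly.enabled"))

  spark.stop()
}
```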