OptimisticTransaction

OptimisticTransaction is an OptimisticTransactionImpl (which seems more of a class name change than anything more important).

OptimisticTransaction is created for changes to a delta table at a given version.

When OptimisticTransaction (as a OptimisticTransactionImpl) is attempted to be committed (that does doCommit internally), the LogStore (of the DeltaLog) is requested to write actions to a delta file, e.g. _delta_log/00000000000000000001.json for the attempt version 1. Only when a FileAlreadyExistsException is thrown a commit is considered unsuccessful and retried.

OptimisticTransaction can be associated with a thread as an active transaction.

Creating Instance

OptimisticTransaction takes the following to be created:

The DeltaLog and Snapshot are part of the OptimisticTransactionImpl contract (which in turn inherits them as a TransactionalWrite and changes to val from def).

OptimisticTransaction is created when DeltaLog is used for the following:

Active Thread-Local OptimisticTransaction

active: ThreadLocal[OptimisticTransaction]

active is a Java ThreadLocal with the OptimisticTransaction of the current thread.

ThreadLocal provides thread-local variables. These variables differ from their normal counterparts in that each thread that accesses one (via its get or set method) has its own, independently initialized copy of the variable.

ThreadLocal instances are typically private static fields in classes that wish to associate state with a thread (e.g., a user ID or Transaction ID).

active is assigned to the current thread using setActive utility and cleared in clearActive.

active is available using getActive utility.

There can only be one active OptimisticTransaction (or an IllegalStateException is thrown).

Utilities

setActive

setActive(
  txn: OptimisticTransaction): Unit

setActive simply associates the given OptimisticTransaction as active with the current thread.

setActive throws an IllegalStateException if there is an active OptimisticTransaction already associated:

Cannot set a new txn as active when one is already active

setActive is used when DeltaLog is requested to execute an operation in a new transaction.

clearActive

clearActive(): Unit

clearActive simply clears the active transaction (so no transaction is associated with a thread).

clearActive is used when DeltaLog is requested to execute an operation in a new transaction.

getActive

getActive(): Option[OptimisticTransaction]

getActive simply returns the active transaction.

getActive seems unused.

Logging

Enable ALL logging level for org.apache.spark.sql.delta.OptimisticTransaction logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.sql.delta.OptimisticTransaction=ALL

Refer to Logging.

Demo

import org.apache.spark.sql.delta.DeltaLog
val dir = "/tmp/delta/users"
val log = DeltaLog.forTable(spark, dir)

val txn = log.startTransaction()

// ...changes to a delta table...
val addFile = AddFile("foo", Map.empty, 1L, System.currentTimeMillis(), dataChange = true)
val removeFile = addFile.remove
val actions = addFile :: removeFile :: Nil

txn.commit(actions, op)

// You could do the following instead

deltaLog.withNewTransaction { txn =>
  // ...transactional changes to a delta table
}