OffsetSeqLog

OffsetSeqLog is an HDFSMetadataLog of OffsetSeq.

The streaming query execution engines use OffsetSeqLog as the write-ahead log (WAL) of offsets.

Creating Instance

OffsetSeqLog takes the following to be created:

  • SparkSession (Spark SQL)
  • Metadata log directory

OffsetSeqLog is created when:

get

get(
  batchId: Long): Option[OffsetSeq]

get is part of the HDFSMetadataLog abstraction.


get looks up the batchId in the cachedMetadata first and, only if the batch is not cached, requests the parent HDFSMetadataLog to find it.
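
The cache-first lookup can be sketched as follows. This is a minimal illustration, not the Spark source: ParentGetSketch and CachedGetSketch are hypothetical names, the parent log is replaced by an in-memory map, and plain strings stand in for OffsetSeq.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the parent HDFSMetadataLog (an in-memory map
// instead of files); `reads` counts how often the parent is consulted.
class ParentGetSketch {
  private val files = mutable.Map.empty[Long, String]
  var reads = 0
  def put(batchId: Long, metadata: String): Unit = files(batchId) = metadata
  def get(batchId: Long): Option[String] = {
    reads += 1
    files.get(batchId)
  }
}

// Hypothetical model of the lookup only: try the cache, then the parent.
class CachedGetSketch(parent: ParentGetSketch) {
  val cachedMetadata = mutable.Map.empty[Long, String]

  def get(batchId: Long): Option[String] =
    // orElse takes its argument by name, so the parent is asked
    // only when the cache has no entry for batchId
    cachedMetadata.get(batchId).orElse(parent.get(batchId))
}
```

In this sketch a cache hit never touches the parent log, which is the point of keeping recent batches in memory.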

add

add(
  batchId: Long,
  metadata: OffsetSeq): Boolean

add is part of the HDFSMetadataLog abstraction.


add requests the parent HDFSMetadataLog to add the given batchId and metadata.

When successful, add stores the metadata in the cachedMetadata and evicts all older batch metadata from the cache, keeping only the last two batches.
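
The write-then-cache behavior can be sketched as follows. This is a minimal illustration, not the Spark source: ParentAddSketch and CachedAddSketch are hypothetical names, plain strings stand in for OffsetSeq, and "the last two" is read as the batch just added plus its immediate predecessor, which is an assumption of this sketch.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the parent HDFSMetadataLog: add returns false
// when metadata for the batch has already been stored.
class ParentAddSketch {
  private val files = mutable.Map.empty[Long, String]
  def add(batchId: Long, metadata: String): Boolean =
    if (files.contains(batchId)) false
    else { files(batchId) = metadata; true }
}

// Hypothetical model of add: persist through the parent first,
// then cache the entry and evict older batches.
class CachedAddSketch(parent: ParentAddSketch) {
  // Sorted by batch ID so the cache contents are easy to inspect
  val cachedMetadata = mutable.TreeMap.empty[Long, String]

  def add(batchId: Long, metadata: String): Boolean = {
    val added = parent.add(batchId, metadata)
    if (added) {
      cachedMetadata.put(batchId, metadata)
      // Assumption: evict every batch up to and including batchId - 2,
      // which leaves batchId - 1 and batchId in the cache
      val stale = cachedMetadata.keys.filter(_ <= batchId - 2).toList
      stale.foreach(k => cachedMetadata.remove(k))
    }
    added
  }
}
```

Adding batches 0 through 4 to a fresh CachedAddSketch leaves only batches 3 and 4 in cachedMetadata, and a rejected add (the parent returns false) leaves the cache unchanged.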