CommitLog — HDFSMetadataLog for Offset Commit Log
CommitLog is an HDFSMetadataLog with CommitMetadata metadata.
CommitLog is the offset commit log of streaming query execution engines.
[[CommitMetadata]][[nextBatchWatermarkMs]] CommitLog uses CommitMetadata as the metadata, with a single nextBatchWatermarkMs attribute (of type Long, default: 0).
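CommitLog writes commit metadata to files in the commits directory of a checkpoint directory, one file per committed batch, named after the batch ID.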
$ ls -tr [checkpoint-directory]/commits
0 1 2 3 4 5 6 7 8 9
$ cat [checkpoint-directory]/commits/8
v1
{"nextBatchWatermarkMs": 0}
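The directory listing above suggests how committed batches could be discovered from the file names alone. The following is a minimal sketch that uses plain local-file APIs rather than Spark's HDFSMetadataLog; CommitsDirectorySketch, committedBatchIds and latestCommittedBatchId are hypothetical names introduced for illustration only, and the sketch assumes every file with a purely numeric name is a commit file.

```scala
import java.io.File
import java.nio.file.Path
import scala.util.Try

// Hypothetical helper (not part of Spark): inspects a local commits directory.
object CommitsDirectorySketch {

  // Returns the batch IDs found in the directory, in ascending order.
  // Files whose names are not plain numbers (e.g. checksum files) are skipped.
  def committedBatchIds(commitsDir: Path): Seq[Long] = {
    // listFiles returns null when the path is not a readable directory
    val files = Option(commitsDir.toFile.listFiles()).getOrElse(Array.empty[File])
    files.toSeq
      .filter(_.isFile)
      .flatMap(f => Try(f.getName.toLong).toOption)
      .sorted
  }

  // The highest batch ID with a commit file, if any.
  def latestCommittedBatchId(commitsDir: Path): Option[Long] =
    committedBatchIds(commitsDir).lastOption
}
```

Sorting numerically matters here: a lexicographic sort of file names would place 10 before 2.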
[[VERSION]] CommitLog uses 1 for the version.
[[creating-instance]] CommitLog (like the parent HDFSMetadataLog) takes the following to be created:
- [[sparkSession]] SparkSession
- [[path]] Path of the metadata log directory
=== [[serialize]] Serializing Metadata (Writing Metadata to Persistent Storage) -- serialize Method
[source, scala]
serialize(metadata: CommitMetadata, out: OutputStream): Unit
serialize writes out the version prefixed with v on a single line (e.g. v1) followed by the given CommitMetadata in JSON format.
serialize is part of HDFSMetadataLog abstraction.
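The two-line layout described above can be sketched as follows. This is a minimal sketch under stated assumptions, not Spark's implementation: CommitMetadataSketch, its hand-built json method and the Version constant are hypothetical stand-ins so that the example runs without Spark on the classpath.

```scala
import java.io.OutputStream
import java.nio.charset.StandardCharsets.UTF_8

object CommitLogSerializeSketch {

  // Hypothetical stand-in for CommitMetadata with its single attribute.
  final case class CommitMetadataSketch(nextBatchWatermarkMs: Long = 0) {
    // Hand-built JSON to keep the sketch dependency-free (an assumption;
    // a real implementation would likely use a JSON library).
    def json: String = s"""{"nextBatchWatermarkMs":$nextBatchWatermarkMs}"""
  }

  // Assumed to match the version the page states CommitLog uses.
  val Version = 1

  def serialize(metadata: CommitMetadataSketch, out: OutputStream): Unit = {
    // First line: the version prefixed with "v", e.g. "v1"
    out.write(s"v$Version".getBytes(UTF_8))
    out.write('\n')
    // Second line: the metadata in JSON format
    out.write(metadata.json.getBytes(UTF_8))
  }
}
```

The sketch leaves the stream open, on the assumption that the caller owns the stream and closes it.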
=== [[deserialize]] Deserializing Metadata -- deserialize Method
[source, scala]
deserialize(in: InputStream): CommitMetadata
deserialize simply reads (deserializes) two lines from the given InputStream: the version and the CommitMetadata in JSON format.
deserialize is part of HDFSMetadataLog abstraction.
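Reading the two lines back can be sketched as follows. Again, this is a minimal sketch rather than Spark's code: CommitMetadataSketch, Version and the regex-based field extraction are assumptions made so that the example runs without Spark or a JSON library, and falling back to a watermark of 0 when the second line is missing is likewise an assumption based on the default stated earlier.

```scala
import java.io.InputStream
import java.nio.charset.StandardCharsets.UTF_8
import scala.io.Source

object CommitLogDeserializeSketch {

  // Hypothetical stand-in for CommitMetadata.
  final case class CommitMetadataSketch(nextBatchWatermarkMs: Long = 0)

  // Assumed to match the version the page states CommitLog uses.
  val Version = 1

  // Naive extraction of the single numeric attribute (not a general JSON parser).
  private val WatermarkField = """"nextBatchWatermarkMs"\s*:\s*(\d+)""".r

  def deserialize(in: InputStream): CommitMetadataSketch = {
    val lines = Source.fromInputStream(in, UTF_8.name()).getLines()
    if (!lines.hasNext) {
      throw new IllegalStateException("Incomplete log file: missing version line")
    }
    // First line: the version, e.g. "v1"
    val versionLine = lines.next().trim
    require(versionLine == s"v$Version", s"Unsupported version line: $versionLine")
    // Second line: the metadata in JSON format (empty JSON object if absent)
    val json = if (lines.hasNext) lines.next() else "{}"
    val watermark = WatermarkField
      .findFirstMatchIn(json)
      .map(_.group(1).toLong)
      .getOrElse(0L)
    CommitMetadataSketch(watermark)
  }
}
```

With this sketch, an input of v1 followed by a line holding {"nextBatchWatermarkMs": 0} yields CommitMetadataSketch(0).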
=== [[add-batchId]] add Method
[source, scala]
add(batchId: Long): Unit
add...FIXME
NOTE: add is used when...FIXME
=== [[add-batchId-metadata]] add Method
[source, scala]
add(batchId: Long, metadata: String): Boolean
add...FIXME
add is part of MetadataLog abstraction.