CommitLog -- HDFSMetadataLog for Offset Commit Log
CommitLog is an HDFSMetadataLog with CommitMetadata metadata.

CommitLog is the offset commit log of streaming query execution engines.

[[CommitMetadata]][[nextBatchWatermarkMs]] CommitLog uses CommitMetadata for the metadata with the nextBatchWatermarkMs attribute (of type Long, with the default 0).
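
CommitLog <<serialize, writes>> the commit metadata to files named after batch IDs (in the commits subdirectory of the checkpoint directory):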
$ ls -tr [checkpoint-directory]/commits
0 1 2 3 4 5 6 7 8 9
$ cat [checkpoint-directory]/commits/8
v1
{"nextBatchWatermarkMs": 0}
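
The listing above can also be inspected programmatically. What follows is a minimal sketch, not part of Spark: `CommitFilesSketch` and `committedBatchIds` are hypothetical names, and the sketch assumes the commits directory is on a local file system and that every file with a purely numeric name is a commit file.

```scala
import java.io.File
import scala.util.Try

object CommitFilesSketch {
  // Returns the batch IDs found in the given commits directory, in ascending order.
  // Files whose names are not plain numbers are skipped (an assumption of this sketch).
  def committedBatchIds(commitsDir: File): Seq[Long] = {
    // listFiles returns null when the path is not a directory
    val files = Option(commitsDir.listFiles()).getOrElse(Array.empty[File])
    files.toSeq
      .filter(_.isFile)
      .flatMap(f => Try(f.getName.toLong).toOption)
      .sorted
  }
}
```

Sorting the names as numbers (rather than as strings) keeps batch 10 after batch 9, so `committedBatchIds(dir).lastOption` gives the most recent committed batch under the assumptions above.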

[[VERSION]] CommitLog uses 1 for the version.

[[creating-instance]] CommitLog (like the parent HDFSMetadataLog) takes the following to be created:

- [[sparkSession]] SparkSession
- [[path]] Path of the metadata log directory

=== [[serialize]] Serializing Metadata (Writing Metadata to Persistent Storage) -- serialize Method

[source, scala]
serialize(metadata: CommitMetadata, out: OutputStream): Unit

serialize writes out the <<VERSION, version>> prefixed with v on a single line (e.g. v1) followed by the given CommitMetadata in JSON format.

serialize is part of the HDFSMetadataLog abstraction.
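
To make the two-line format concrete, here is a minimal, dependency-free sketch of the write path. It is not Spark's implementation: `SerializeSketch`, `Metadata` and `Version` are hypothetical stand-ins, and the JSON is hand-rolled for this single attribute instead of going through a JSON library.

```scala
import java.io.OutputStream
import java.nio.charset.StandardCharsets.UTF_8

object SerializeSketch {
  // Hypothetical stand-in for CommitMetadata (not the Spark class)
  final case class Metadata(nextBatchWatermarkMs: Long = 0L) {
    // Hand-rolled JSON to keep the sketch free of dependencies
    def json: String = s"""{"nextBatchWatermarkMs":$nextBatchWatermarkMs}"""
  }

  // Assumed to match the version described in this document
  val Version = 1

  def serialize(metadata: Metadata, out: OutputStream): Unit = {
    // First line: the version prefixed with "v"
    out.write(s"v$Version".getBytes(UTF_8))
    out.write('\n')
    // Second line: the metadata as JSON
    out.write(metadata.json.getBytes(UTF_8))
  }
}
```

Writing `SerializeSketch.Metadata()` to a `java.io.ByteArrayOutputStream` yields the line `v1` followed by the JSON line, the same shape as the commit file shown earlier (apart from whitespace inside the JSON).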

=== [[deserialize]] Deserializing Metadata -- deserialize Method

[source, scala]
deserialize(in: InputStream): CommitMetadata

deserialize simply reads (deserializes) two lines from the given InputStream: the first for the version and the second for the <<CommitMetadata, CommitMetadata>> (in JSON format).

deserialize is part of the HDFSMetadataLog abstraction.
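
The read path can be sketched in the same spirit. This is an illustration only, not Spark's code: `DeserializeSketch`, `Metadata` and `SupportedVersion` are hypothetical names, the watermark is extracted with a regular expression instead of a JSON parser, and falling back to the default 0 when the attribute or the second line is missing is an assumption of this sketch.

```scala
import java.io.InputStream
import java.nio.charset.StandardCharsets.UTF_8
import scala.io.Source

object DeserializeSketch {
  // Hypothetical stand-in for CommitMetadata (not the Spark class)
  final case class Metadata(nextBatchWatermarkMs: Long = 0L)

  // Highest version this sketch accepts (assumed to be 1, as in this document)
  val SupportedVersion = 1

  // Matches "nextBatchWatermarkMs": <digits>, with optional whitespace around the colon
  private val Watermark = """"nextBatchWatermarkMs"\s*:\s*(\d+)""".r

  def deserialize(in: InputStream): Metadata = {
    val lines = Source.fromInputStream(in, UTF_8.name()).getLines()
    require(lines.hasNext, "missing version line")

    // First line: the version, e.g. "v1" (a non-numeric suffix fails with NumberFormatException)
    val versionLine = lines.next().trim
    require(versionLine.startsWith("v"), s"malformed version line: $versionLine")
    val version = versionLine.drop(1).toInt
    require(version >= 1 && version <= SupportedVersion, s"unsupported version: $version")

    // Second line: the metadata as JSON (default applies when absent, by assumption)
    val json = if (lines.hasNext) lines.next() else "{}"
    val watermark = Watermark.findFirstMatchIn(json).map(_.group(1).toLong).getOrElse(0L)
    Metadata(watermark)
  }
}
```

Checking the version before touching the second line means a file written in an unknown format is rejected early instead of being misread as metadata.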

=== [[add-batchId]] add Method

[source, scala]
add(batchId: Long): Unit

add...FIXME

NOTE: add is used when...FIXME

=== [[add-batchId-metadata]] add Method

[source, scala]
add(batchId: Long, metadata: String): Boolean

add...FIXME

add is part of the MetadataLog abstraction.