RocksDBFileManager¶

RocksDBFileManager is the file manager of RocksDB.
Creating Instance¶

RocksDBFileManager takes the following to be created:

- DFS Root Directory
- Local Temporary Directory
- Hadoop Configuration
- Logging ID

RocksDBFileManager is created when:

- RocksDB is created
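A minimal sketch of wiring these arguments together (the constructor shape, parameter order, and all values below are assumptions for illustration; RocksDBFileManager is an internal Spark class, so this is illustrative only):

```scala
import java.io.File
import org.apache.hadoop.conf.Configuration
import org.apache.spark.sql.execution.streaming.state.RocksDBFileManager

// Hypothetical values: the DFS root directory typically points at the state
// store checkpoint location of a given operator and partition.
val dfsRootDir = "hdfs://namenode:8020/checkpoints/state/0/0"
val localTempDir = new File("/tmp/rocksdb-file-manager")
val hadoopConf = new Configuration()
val loggingId = "StateStoreId(opId=0,partId=0)"

// Assumed constructor shape, following the argument list above
val fileManager = new RocksDBFileManager(dfsRootDir, localTempDir, hadoopConf, loggingId)
```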
saveCheckpointToDfs¶

```scala
saveCheckpointToDfs(
  checkpointDir: File,
  version: Long,
  numKeys: Long): Unit
```

Saves all the files in the given local checkpointDir checkpoint directory as a committed version to DFS.
The duration of saveCheckpointToDfs is tracked and available as the RocksDB: commit - file sync to external storage time metric (via fileSync).
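A sketch of the timing idea only (the helper name and wiring below are assumptions, not Spark's actual code): the caller measures the wall-clock time of the saveCheckpointToDfs call and reports it as the file-sync duration.

```scala
import java.io.File
import org.apache.spark.sql.execution.streaming.state.RocksDBFileManager

// Measure how long a block of code takes, in milliseconds (assumed helper).
def timeTakenMs(body: => Unit): Long = {
  val start = System.nanoTime()
  body
  (System.nanoTime() - start) / 1000000
}

// The caller (e.g. RocksDB when committing) could record the duration like this.
def saveAndMeasure(
    fileManager: RocksDBFileManager,
    checkpointDir: File,
    version: Long,
    numKeys: Long): Long =
  timeTakenMs {
    fileManager.saveCheckpointToDfs(checkpointDir, version, numKeys)
  }
```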
saveCheckpointToDfs logs the files in the given checkpointDir directory by printing out the following INFO message to the logs:

Saving checkpoint files for version [version] - [num] files
[path] - [length] bytes
saveCheckpointToDfs lists the RocksDB files (listRocksDBFiles) in the given checkpointDir directory.
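A simplified sketch of the idea behind listRocksDBFiles (an assumption for illustration, not the exact Spark implementation): RocksDB SST and log files are immutable once written and can be tracked separately from the files that change in place (MANIFEST, OPTIONS, CURRENT, ...).

```scala
import java.io.File

// Split the files of a local checkpoint directory into immutable files
// (SST and log files) and everything else (simplified sketch).
def listRocksDBFiles(checkpointDir: File): (Seq[File], Seq[File]) = {
  val allFiles = Option(checkpointDir.listFiles()).getOrElse(Array.empty[File])
    .filterNot(_.isDirectory)
    .toSeq
  allFiles.partition { f =>
    f.getName.endsWith(".sst") || f.getName.endsWith(".log")
  }
}
```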
saveCheckpointToDfs saves the immutable files to DFS (saveImmutableFilesToDfs).
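A rough sketch of the saveImmutableFilesToDfs idea (assumed and simplified): copy each new immutable file to DFS under a unique name, so later versions can reference files that were already uploaded instead of uploading them again.

```scala
import java.io.File
import java.util.UUID
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Copy one immutable local file to DFS under a unique name (assumed layout).
def copyImmutableFileToDfs(
    localFile: File,
    dfsRootDir: String,
    hadoopConf: Configuration): Path = {
  val dfsFile = new Path(s"$dfsRootDir/${UUID.randomUUID()}-${localFile.getName}")
  val fs = dfsFile.getFileSystem(hadoopConf)
  fs.copyFromLocalFile(new Path(localFile.getAbsolutePath), dfsFile)
  dfsFile
}
```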
saveCheckpointToDfs creates a RocksDBCheckpointMetadata.

saveCheckpointToDfs determines the local metadata file (localMetadataFile) in the given checkpointDir directory.

saveCheckpointToDfs requests the RocksDBCheckpointMetadata to writeToFile (to the local metadata file).
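A sketch of this metadata step, assuming the metadata is a small JSON document written to a file inside the checkpoint directory (the file name, field names, and JSON shape below are assumptions for illustration, not the actual RocksDBCheckpointMetadata format):

```scala
import java.io.File
import java.nio.charset.StandardCharsets
import java.nio.file.Files

// A stand-in for RocksDBCheckpointMetadata (field names are assumptions).
final case class CheckpointMetadataSketch(immutableFiles: Seq[String], numKeys: Long) {
  def toJson: String = {
    val files = immutableFiles.map(f => "\"" + f + "\"").mkString(", ")
    s"""{"immutableFiles": [$files], "numKeys": $numKeys}"""
  }
}

// Write the metadata as JSON into the local checkpoint directory
// ("metadata" as the file name is an assumption).
def writeMetadata(checkpointDir: File, metadata: CheckpointMetadataSketch): File = {
  val localMetadataFile = new File(checkpointDir, "metadata")
  Files.write(localMetadataFile.toPath, metadata.toJson.getBytes(StandardCharsets.UTF_8))
  localMetadataFile
}
```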
saveCheckpointToDfs prints out the following INFO message to the logs:

Written metadata for version [version]:
[metadata]
saveCheckpointToDfs zips the remaining checkpoint files to DFS (zipToDfsFile).
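A sketch of the zipToDfsFile idea (assumed and simplified): stream the remaining checkpoint files, together with the metadata file, into a single zip archive created directly on DFS, e.g. a version-numbered zip under the DFS root directory (the exact layout is an assumption for illustration).

```scala
import java.io.File
import java.nio.file.Files
import java.util.zip.{ZipEntry, ZipOutputStream}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Zip the given local files into a single archive written directly to DFS.
def zipToDfsFile(files: Seq[File], dfsZipFile: Path, hadoopConf: Configuration): Unit = {
  val fs = dfsZipFile.getFileSystem(hadoopConf)
  val out = new ZipOutputStream(fs.create(dfsZipFile, true))
  try {
    files.foreach { file =>
      out.putNextEntry(new ZipEntry(file.getName))
      Files.copy(file.toPath, out) // stream the file content into the zip entry
      out.closeEntry()
    }
  } finally {
    out.close()
  }
}
```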
In the end, saveCheckpointToDfs prints out the following INFO message to the logs:

Saved checkpoint file for version [version]
saveCheckpointToDfs is used when:

- RocksDB is requested to commit state changes
Logging¶

Enable ALL logging level for org.apache.spark.sql.execution.streaming.state.RocksDBFileManager logger to see what happens inside.

Add the following line to conf/log4j.properties:
log4j.logger.org.apache.spark.sql.execution.streaming.state.RocksDBFileManager=ALL
Refer to Logging.