RocksDBFileManager¶
RocksDBFileManager is the file manager of RocksDB.
Creating Instance¶
RocksDBFileManager takes the following to be created:
- DFS Root Directory
- Local Temporary Directory
- Hadoop Configuration
- Logging ID
RocksDBFileManager is created when:
RocksDB is created
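As a rough illustration, the four arguments above could be wired together as follows. This is a minimal sketch only: the argument order mirrors the list above, and the example values (paths, logging ID) are made up, so the actual Spark constructor may differ.

```scala
import java.nio.file.Files
import org.apache.hadoop.conf.Configuration

// Hypothetical wiring of the constructor arguments listed above
// (the argument order follows the list; the real signature may differ).
val fileManager = new RocksDBFileManager(
  "hdfs://namenode/checkpoints/query-1/state/0/0",   // DFS root directory
  Files.createTempDirectory("rocksdb-local").toFile, // local temporary directory
  new Configuration(),                               // Hadoop Configuration
  "StateStoreId(opId=0,partId=0)")                   // logging ID
```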
saveCheckpointToDfs¶
```scala
saveCheckpointToDfs(
  checkpointDir: File,
  version: Long,
  numKeys: Long): Unit
```
Saves all the files in the given local checkpointDir checkpoint directory to DFS as a committed version.
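For illustration only, a call following the signature above could look like this (the directory, version and key count are made-up values; in practice only RocksDB invokes this method):

```scala
import java.io.File

// Hypothetical invocation: publish the local checkpoint directory as version 42.
fileManager.saveCheckpointToDfs(
  checkpointDir = new File("/tmp/rocksdb-checkpoint-42"),
  version = 42L,
  numKeys = 1000L)
```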
The duration of saveCheckpointToDfs is tracked and available as the RocksDB: commit - file sync to external storage time metric (via fileSync).
saveCheckpointToDfs logs the files in the given checkpointDir directory, printing out the following INFO messages to the logs:
Saving checkpoint files for version [version] - [num] files
[path] - [length] bytes
saveCheckpointToDfs lists the RocksDB files (listRocksDBFiles) in the given checkpointDir directory.
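A sketch of what listing the RocksDB files can boil down to: partitioning the checkpoint directory contents into immutable files (SST and log files) and everything else. The predicate and the return shape here are assumptions for illustration, not Spark's exact implementation.

```scala
import java.io.File

// Assumed behaviour: split the checkpoint directory contents into
// immutable RocksDB files (*.sst, *.log) and the remaining files.
def listRocksDBFilesSketch(localDir: File): (Seq[File], Seq[File]) = {
  val allFiles = Option(localDir.listFiles).getOrElse(Array.empty[File]).filter(_.isFile).toSeq
  allFiles.partition(f => f.getName.endsWith(".sst") || f.getName.endsWith(".log"))
}

val (immutableFiles, otherFiles) = listRocksDBFilesSketch(new File("/tmp/rocksdb-checkpoint-42"))
```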
saveCheckpointToDfs saves the immutable files to DFS (saveImmutableFilesToDfs).
saveCheckpointToDfs creates a RocksDBCheckpointMetadata.
saveCheckpointToDfs resolves the local metadata file (localMetadataFile) in the given checkpointDir directory.
saveCheckpointToDfs requests the RocksDBCheckpointMetadata to writeToFile.
saveCheckpointToDfs prints out the following INFO message to the logs:
Written metadata for version [version]:
[metadata]
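To make the metadata step concrete, here is a simplified stand-in for RocksDBCheckpointMetadata and its writeToFile. The field names and the JSON layout are assumptions for illustration, not the exact format Spark writes.

```scala
import java.io.File
import java.nio.charset.StandardCharsets
import java.nio.file.Files

// Simplified stand-in for RocksDBCheckpointMetadata: records which immutable
// files back this version and how many keys it holds (fields are assumptions).
case class CheckpointMetadataSketch(fileNames: Seq[String], numKeys: Long) {

  // Write the metadata as a small JSON document (layout is illustrative only).
  def writeToFile(metadataFile: File): Unit = {
    val quoted = fileNames.map(n => "\"" + n + "\"").mkString(", ")
    val json = s"""{"files": [$quoted], "numKeys": $numKeys}"""
    Files.write(metadataFile.toPath, json.getBytes(StandardCharsets.UTF_8))
  }
}

// The metadata file lives inside the local checkpoint directory and is zipped to DFS next.
CheckpointMetadataSketch(Seq("000010.sst", "000011.sst"), numKeys = 1000L)
  .writeToFile(new File("/tmp/rocksdb-checkpoint-42/metadata"))
```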
saveCheckpointToDfs zips the metadata file to DFS (zipToDfsFile).
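A rough sketch of what zipping to a DFS file can look like with the Hadoop FileSystem API: open an output stream on DFS and stream each local file in as a zip entry. The method name and arguments mirror the step above; error handling and other details of Spark's actual implementation are omitted.

```scala
import java.io.File
import java.util.zip.{ZipEntry, ZipOutputStream}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Sketch: compress the given local files into a single zip file on DFS.
def zipToDfsFileSketch(files: Seq[File], dfsZipFile: Path, hadoopConf: Configuration): Unit = {
  val fs = dfsZipFile.getFileSystem(hadoopConf)
  val out = new ZipOutputStream(fs.create(dfsZipFile, true)) // overwrite any existing file
  try {
    files.foreach { file =>
      out.putNextEntry(new ZipEntry(file.getName))
      java.nio.file.Files.copy(file.toPath, out) // stream the file bytes into the entry
      out.closeEntry()
    }
  } finally {
    out.close()
  }
}
```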
In the end, saveCheckpointToDfs prints out the following INFO message to the logs:
Saved checkpoint file for version [version]
saveCheckpointToDfs is used when:
RocksDB is requested to commit state changes
Logging¶
Enable ALL logging level for org.apache.spark.sql.execution.streaming.state.RocksDBFileManager logger to see what happens inside.
Add the following line to conf/log4j.properties:
log4j.logger.org.apache.spark.sql.execution.streaming.state.RocksDBFileManager=ALL
Refer to Logging.