CheckpointFileManager¶
CheckpointFileManager is an abstraction of checkpoint managers that manage checkpoint files (metadata of streaming batches) on Hadoop DFS-compatible file systems.
CheckpointFileManager is created based on spark.sql.streaming.checkpointFileManagerClass configuration property if defined before reverting to the available checkpoint managers.
Contract (Subset)¶
createAtomic¶
createAtomic(
path: Path,
overwriteIfPossible: Boolean): CancellableFSDataOutputStream
Used when:
HDFSMetadataLogis requested to addNewBatchByStreamStreamMetadatais requested to write a metadata to a fileHDFSBackedStateStoreProvideris requested for the deltaFileStream and writeSnapshotFileRocksDBFileManageris requested for the zipToDfsFileStateSchemaCompatibilityCheckeris requested for the createSchemaFile
createCheckpointDirectory¶
createCheckpointDirectory(): Path
Creates the checkpoint path
Used when:
ResolveWriteToStreamis requested toresolveCheckpointLocation
Implementations¶
Creating CheckpointFileManager¶
create(
path: Path,
hadoopConf: Configuration): CheckpointFileManager
create uses spark.sql.streaming.checkpointFileManagerClass as the name of the implementation to instantiate.
If undefined, create creates a FileContextBasedCheckpointFileManager first, and, in case of an UnsupportedFileSystemException, falls back to a FileSystemBasedCheckpointFileManager.
create prints out the following WARN message to the logs if UnsupportedFileSystemException happens:
Could not use FileContext API for managing Structured Streaming checkpoint files at [path].
Using FileSystem API instead for managing log files.
If the implementation of FileSystem.rename() is not atomic, then the correctness and fault-tolerance of your Structured Streaming is not guaranteed.
create is used when:
HDFSMetadataLogis requested for the fileManagerStreamExecutionis requested for the fileManagerResolveWriteToStreamis requested toresolveCheckpointLocationStreamMetadatais requested to read the metadata from a file and write a metadata to a fileHDFSBackedStateStoreProvideris requested for the fmRocksDBFileManageris requested for the fmStateSchemaCompatibilityCheckeris requested for the fm