CheckpointFileManager¶
CheckpointFileManager
is an abstraction of checkpoint managers that manage checkpoint files (metadata of streaming batches) on Hadoop DFS-compatible file systems.
CheckpointFileManager
is created based on spark.sql.streaming.checkpointFileManagerClass configuration property if defined before reverting to the available checkpoint managers.
Contract (Subset)¶
createAtomic¶
createAtomic(
path: Path,
overwriteIfPossible: Boolean): CancellableFSDataOutputStream
Used when:
HDFSMetadataLog
is requested to addNewBatchByStreamStreamMetadata
is requested to write a metadata to a fileHDFSBackedStateStoreProvider
is requested for the deltaFileStream and writeSnapshotFileRocksDBFileManager
is requested for the zipToDfsFileStateSchemaCompatibilityChecker
is requested for the createSchemaFile
createCheckpointDirectory¶
createCheckpointDirectory(): Path
Creates the checkpoint path
Used when:
ResolveWriteToStream
is requested toresolveCheckpointLocation
Implementations¶
Creating CheckpointFileManager¶
create(
path: Path,
hadoopConf: Configuration): CheckpointFileManager
create
uses spark.sql.streaming.checkpointFileManagerClass as the name of the implementation to instantiate.
If undefined, create
creates a FileContextBasedCheckpointFileManager first, and, in case of an UnsupportedFileSystemException
, falls back to a FileSystemBasedCheckpointFileManager.
create
prints out the following WARN message to the logs if UnsupportedFileSystemException
happens:
Could not use FileContext API for managing Structured Streaming checkpoint files at [path].
Using FileSystem API instead for managing log files.
If the implementation of FileSystem.rename() is not atomic, then the correctness and fault-tolerance of your Structured Streaming is not guaranteed.
create
is used when:
HDFSMetadataLog
is requested for the fileManagerStreamExecution
is requested for the fileManagerResolveWriteToStream
is requested toresolveCheckpointLocation
StreamMetadata
is requested to read the metadata from a file and write a metadata to a fileHDFSBackedStateStoreProvider
is requested for the fmRocksDBFileManager
is requested for the fmStateSchemaCompatibilityChecker
is requested for the fm