Skip to content

FlowSystemMetadata

FlowSystemMetadata is a SystemMetadata associated with a Flow.

Creating Instance

FlowSystemMetadata takes the following to be created:

FlowSystemMetadata is created when:

Latest Checkpoint Location

latestCheckpointLocation: String

latestCheckpointLocation determines the path of the latest checkpoint directory under the checkpoints root directory.


latestCheckpointLocation is used when:

Checkpoint Directory

flowCheckpointsDir(): Path

flowCheckpointsDir returns the checkpoint directory for this Flow (under the storage root).


flowCheckpointsDir asserts that the destination table of this Flow is either a table or sink.

flowCheckpointsDir builds a path that is the storageRoot (of this PipelineUpdateContext) followed by _checkpoints (hardcoded).

Checkpoint Root Directory

Checkpoint Root Directory of a SDP project is always the storageRoot followed by _checkpoints.

flowCheckpointsDir creates a path of the checkpoint directory based on the following:

  1. A path for the table identifier (of the destination table) with the dots replaced by path separators.
  2. The "table" part of the TableIdentifier of this Flow.

In the end, flowCheckpointsDir prints out the following INFO message to the logs:

Flow [flowName] using checkpoint directory: [checkpointDir]
IllegalArgumentException

throws an IllegalArgumentException when the destination table of this Flow is neither a table nor sink.


flowCheckpointsDir is used when:

  • FIXME

flowCheckpointsDirOpt

flowCheckpointsDirOpt(): Option[Path]
SPARK-56325 Refactor FlowSystemMetadata.flowCheckpointsDirOpt

It should soon be gone.