SinkWrite Flow Execution¶
SinkWrite is a StreamingFlowExecution that writes a streaming DataFrame to a Sink.
SinkWrite represents a StreamingFlow with a Sink as the output destination at execution.
When executed, SinkWrite starts a streaming query to append new rows to an output table.
Creating Instance¶
SinkWrite takes the following to be created:
- TableIdentifier
- ResolvedFlow
- DataflowGraph
- PipelineUpdateContext
- Checkpoint Location
- Streaming Trigger
- Destination (Sink)
- SQL Configuration
SinkWrite is created when:
FlowPlanneris requested to plan a ResolvedFlow
Start Streaming Query¶
StreamingFlowExecution
startStream is part of the StreamingFlowExecution abstraction.
startStream builds the logical query plan of this flow's structured query (requesting the DataflowGraph to reanalyze this flow).
startStream creates a DataStreamWriter (Spark Structured Streaming) with the following:
DataStreamWriter's Property | Value |
|---|---|
queryName | This displayName |
checkpointLocation option | This checkpoint path |
trigger | This streaming trigger |
outputMode | Append (always) |
format | The format of this output sink |
options | The options of this output sink |
In the end, startStream starts the streaming write query to this output table.