SinkWrite Flow Execution¶
SinkWrite is a StreamingFlowExecution that writes a streaming DataFrame to a Sink.
SinkWrite represents a StreamingFlow with a Sink as the output destination at execution.
When executed, SinkWrite starts a streaming query to append new rows to an output table.
Creating Instance¶
SinkWrite takes the following to be created:
- TableIdentifier
- ResolvedFlow
- DataflowGraph
- PipelineUpdateContext
- Checkpoint Location
- Streaming Trigger
- Destination (Sink)
- SQL Configuration
SinkWrite is created when:
FlowPlanneris requested to plan a ResolvedFlow
Start Streaming Query¶
StreamingFlowExecution
startStream(): StreamingQuery
startStream is part of the StreamingFlowExecution abstraction.
startStream builds the logical query plan of this flow's structured query (requesting the DataflowGraph to reanalyze this flow).
startStream creates a DataStreamWriter (Spark Structured Streaming) with the following:
DataStreamWriter's Property | Value |
|---|---|
queryName | This displayName |
checkpointLocation option | This checkpoint path |
trigger | This streaming trigger |
outputMode | Append (always) |
format | The format of this output sink |
options | The options of this output sink |
In the end, startStream starts the streaming write query to this output table.