StreamingTableWrite Flow Execution¶
StreamingTableWrite is a StreamingFlowExecution that represents a StreamingFlow at execution time.
When executed, StreamingTableWrite starts a streaming query (over a streaming DataFrame (Spark SQL)) that appends new rows to a Table destination.
StreamingTableWrite uses DataStreamWriter (Spark Structured Streaming) for writing streaming data with the following properties:
| DataStreamWriter's Property | Value |
|---|---|
| queryName | This displayName |
| format | The format of this output table (if defined) |
| checkpointLocation option | This checkpoint path |
| outputMode | Always Append (Spark Structured Streaming) |
| trigger | Always AvailableNowTrigger (Spark Structured Streaming) (indirectly through this streaming trigger) |
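The configuration above corresponds to the following use of the public DataStreamWriter API. This is a minimal sketch, not the actual implementation; the identifiers `df`, `displayName`, `checkpointPath`, and `tableFormat` are hypothetical stand-ins for the values StreamingTableWrite derives.

```scala
import org.apache.spark.sql.{DataFrame, Row}
import org.apache.spark.sql.streaming.{DataStreamWriter, Trigger}

// Hypothetical stand-ins for the values described in the table above
val df: DataFrame = ???                         // this flow's streaming DataFrame
val displayName = "my_flow"                     // this displayName
val checkpointPath = "/tmp/checkpoints/my_flow" // this checkpoint path
val tableFormat: Option[String] = Some("delta") // the output table's format (if defined)

var writer: DataStreamWriter[Row] = df.writeStream
  .queryName(displayName)
  .option("checkpointLocation", checkpointPath)
  .outputMode("append")            // always Append
  .trigger(Trigger.AvailableNow()) // always AvailableNowTrigger
// format is set only when the output table defines one
tableFormat.foreach(f => writer = writer.format(f))
```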
Creating Instance¶
StreamingTableWrite takes the following to be created:
- TableIdentifier
- ResolvedFlow
- DataflowGraph
- PipelineUpdateContext
- Checkpoint Location
- Streaming Trigger
- Destination (Table)
- SQL Configuration
StreamingTableWrite is created when:
FlowPlanner is requested to plan a StreamingFlow
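Put together, the constructor arguments above give StreamingTableWrite a shape roughly like the following. This is a sketch only; the parameter names are assumptions inferred from the list, not the actual signatures.

```scala
// A sketch of the constructor shape (parameter names are assumed)
case class StreamingTableWrite(
    identifier: TableIdentifier,          // TableIdentifier
    flow: ResolvedFlow,                   // ResolvedFlow
    graph: DataflowGraph,                 // DataflowGraph
    updateContext: PipelineUpdateContext, // PipelineUpdateContext
    checkpointPath: String,               // Checkpoint Location
    trigger: Trigger,                     // Streaming Trigger
    destination: Table,                   // Destination (Table)
    sqlConf: Map[String, String])         // SQL Configuration
  extends StreamingFlowExecution
```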
Execute Streaming Query¶
StreamingFlowExecution
startStream(): StreamingQuery
startStream is part of the StreamingFlowExecution abstraction.
startStream builds the logical query plan of this flow's structured query (requesting the DataflowGraph to reanalyze this flow).
startStream creates a DataStreamWriter (Spark Structured Streaming) with the following:
| DataStreamWriter's Property | Value |
|---|---|
| queryName | This displayName |
| checkpointLocation option | This checkpoint path |
| trigger | This streaming trigger |
| outputMode | Always Append (Spark Structured Streaming) |
| format | The format of this output table (if defined) |
In the end, startStream starts the streaming write query to this output table.
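The steps above can be sketched as follows. This is an illustrative reconstruction, not the actual source: the `reanalyzeFlow` accessor and the way the SQL configuration is applied are assumptions, and `toTable` stands in for however the write to the output table is started.

```scala
def startStream(): StreamingQuery = {
  // Request the DataflowGraph to reanalyze this flow and
  // take the logical query plan of its structured query
  val df = graph.reanalyzeFlow(flow).df // hypothetical accessor

  // Configure the DataStreamWriter per the table above
  var writer = df.writeStream
    .queryName(displayName)
    .option("checkpointLocation", checkpointPath)
    .trigger(trigger)
    .outputMode("append") // always Append
  destination.format.foreach(f => writer = writer.format(f))

  // Apply the SQL configuration before starting (assumed mechanism)
  sqlConf.foreach { case (k, v) => df.sparkSession.conf.set(k, v) }

  // Start the streaming write query to this output table
  writer.toTable(identifier.toString)
}
```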