StreamingQuery¶
StreamingQuery
is an abstraction of handles to streaming queries (that are executed continuously and concurrently on a separate thread).
Creating StreamingQuery¶
StreamingQuery
is created when a streaming query is started using DataStreamWriter.start operator.
Demo: Deep Dive into FileStreamSink
Learn more in Demo: Deep Dive into FileStreamSink.
Managing Active StreamingQueries¶
StreamingQueryManager manages active StreamingQuery
instances and allows to access one (by id) or all active queries (using StreamingQueryManager.get or StreamingQueryManager.active operators, respectively).
States¶
StreamingQuery
can be in two states:
- Active (started)
- Inactive (stopped)
If inactive, StreamingQuery
may have stopped due to an StreamingQueryException.
Implementations¶
Contract¶
awaitTermination¶
awaitTermination(): Unit
awaitTermination(
timeoutMs: Long): Boolean
Used when...FIXME
StreamingQueryException¶
exception: Option[StreamingQueryException]
StreamingQueryException
if the streaming query has finished due to an exception
Used when...FIXME
Explaining Streaming Query¶
explain(): Unit
explain(
extended: Boolean): Unit
Used when...FIXME
Id¶
id: UUID
Unique identifier of the streaming query (that does not change across restarts unlike runId)
Used when...FIXME
isActive¶
isActive: Boolean
Indicates whether the streaming query is active (true
) or not (false
)
Used when...FIXME
StreamingQueryProgress¶
lastProgress: StreamingQueryProgress
The latest StreamingQueryProgress of the streaming query
Used when...FIXME
Query Name¶
name: String
Name of the streaming query (unique across all active queries in SparkSession)
Used when...FIXME
Processing All Available Data¶
processAllAvailable(): Unit
Pauses (blocks) the current thread until the streaming query has no more data to be processed or has been stopped.
Intended for testing
Used when...FIXME
Recent StreamingQueryProgresses¶
recentProgress: Array[StreamingQueryProgress]
Recent StreamingQueryProgress updates.
Used when...FIXME
Run Id¶
runId: UUID
Unique identifier of the current execution of the streaming query (that is different every restart unlike id)
Used when...FIXME
SparkSession¶
sparkSession: SparkSession
Used when...FIXME
StreamingQueryStatus¶
status: StreamingQueryStatus
StreamingQueryStatus of the streaming query (as StreamExecution
has accumulated being a ProgressReporter
while running the streaming query)
Used when...FIXME
Stopping Streaming Query¶
stop(): Unit
Stops the streaming query
Used when...FIXME