Skip to content

StreamingQuery

StreamingQuery is an abstraction of handles to streaming queries (that are executed continuously and concurrently on a separate thread).

Creating StreamingQuery

StreamingQuery is created when a streaming query is started using DataStreamWriter.start operator.

Demo: Deep Dive into FileStreamSink

Learn more in Demo: Deep Dive into FileStreamSink.

Managing Active StreamingQueries

StreamingQueryManager manages active StreamingQuery instances and allows to access one (by id) or all active queries (using StreamingQueryManager.get or StreamingQueryManager.active operators, respectively).

States

StreamingQuery can be in two states:

  • Active (started)
  • Inactive (stopped)

If inactive, StreamingQuery may have stopped due to an StreamingQueryException.

Implementations

Contract

awaitTermination

awaitTermination(): Unit
awaitTermination(
  timeoutMs: Long): Boolean

Used when...FIXME

StreamingQueryException

exception: Option[StreamingQueryException]

StreamingQueryException if the streaming query has finished due to an exception

Used when...FIXME

Explaining Streaming Query

explain(): Unit
explain(
  extended: Boolean): Unit

Used when...FIXME

Id

id: UUID

Unique identifier of the streaming query (that does not change across restarts unlike runId)

Used when...FIXME

isActive

isActive: Boolean

Indicates whether the streaming query is active (true) or not (false)

Used when...FIXME

StreamingQueryProgress

lastProgress: StreamingQueryProgress

The latest StreamingQueryProgress of the streaming query

Used when...FIXME

Query Name

name: String

Name of the streaming query (unique across all active queries in SparkSession)

Used when...FIXME

Processing All Available Data

processAllAvailable(): Unit

Pauses (blocks) the current thread until the streaming query has no more data to be processed or has been stopped.

Intended for testing

Used when...FIXME

Recent StreamingQueryProgresses

recentProgress: Array[StreamingQueryProgress]

Recent StreamingQueryProgress updates.

Used when...FIXME

Run Id

runId: UUID

Unique identifier of the current execution of the streaming query (that is different every restart unlike id)

Used when...FIXME

SparkSession

sparkSession: SparkSession

Used when...FIXME

StreamingQueryStatus

status: StreamingQueryStatus

StreamingQueryStatus of the streaming query (as StreamExecution has accumulated being a ProgressReporter while running the streaming query)

Used when...FIXME

Stopping Streaming Query

stop(): Unit

Stops the streaming query

Used when...FIXME