StateOperatorProgress

StateOperatorProgress holds progress information about (the updates made to) a stateful operator in a micro-batch of a StreamingQuery.

StateOperatorProgress can be read in JSON format using jsonValue.
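As a rough illustration of where these objects surface to user code, the sketch below reads the per-operator progress from the last reported progress of a running query. It assumes Spark SQL is on the classpath and uses the public `StreamingQuery.lastProgress`, `StreamingQueryProgress.stateOperators` and `StateOperatorProgress.json` members; the `StateOperatorReport` object and its `line` helper are hypothetical names introduced only for this example.

```scala
import org.apache.spark.sql.streaming.StreamingQuery

// Hypothetical helper object, not part of Spark.
object StateOperatorReport {

  // Pure formatting helper (illustration only).
  def line(name: String, total: Long, updated: Long): String =
    s"$name: numRowsTotal=$total numRowsUpdated=$updated"

  // lastProgress is null until the first progress update is available,
  // hence the Option wrapper.
  def lines(query: StreamingQuery): Seq[String] =
    Option(query.lastProgress).toSeq
      .flatMap(_.stateOperators.toSeq)
      .map(op => line(op.operatorName, op.numRowsTotal, op.numRowsUpdated))

  // The compact JSON form of every stateful operator's progress.
  def jsons(query: StreamingQuery): Seq[String] =
    Option(query.lastProgress).toSeq
      .flatMap(_.stateOperators.toSeq)
      .map(_.json)
}
```

Given a started query, `StateOperatorReport.lines(query).foreach(println)` would print one line per stateful operator of the most recent micro-batch, or nothing if no progress has been reported yet.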

Creating Instance

StateOperatorProgress takes the following to be created:

  • Operator Name
  • numRowsTotal
  • numRowsUpdated
  • allUpdatesTimeMs
  • numRowsRemoved
  • allRemovalsTimeMs
  • commitTimeMs
  • memoryUsedBytes
  • numRowsDroppedByWatermark
  • numShufflePartitions
  • numStateStoreInstances
  • Custom Metrics

StateOperatorProgress is created when:

  • StateStoreWriter is requested to report progress
  • StateOperatorProgress is requested for a copy

numRowsTotal

numRowsTotal is the value of numTotalStateRows metric of a StateStoreWriter physical operator (when requested to get progress).

numRowsUpdated

numRowsUpdated is the value of numUpdatedStateRows metric of a StateStoreWriter physical operator (when requested to get progress).

numRowsUpdated is updated when the SessionWindowStateStoreSaveExec physical operator is requested for a StateOperatorProgress.

numRowsUpdated is displayed in Structured Streaming UI as Aggregated Number Of Updated State Rows.

Custom Metrics

customMetrics: Map[String, Long]

StateOperatorProgress can be given a collection of custom metrics (of the stateful operator it reports progress of). There are no metrics defined by default.

The custom metrics are the stateStoreCustomMetrics and statefulOperatorCustomMetrics of the stateful operator.

Any custom metric included in spark.sql.streaming.ui.enabledCustomMetricList is displayed in Structured Streaming UI.
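The selection described above can be pictured as a simple filter over the custom metrics map. The following is only a sketch under stated assumptions: `CustomMetricsFilter` and `metricsToDisplay` are hypothetical names, the metric names in the usage note are illustrative, and matching by exact name is an assumption (the web UI may compare names differently).

```scala
// Hypothetical helper, not Spark code.
object CustomMetricsFilter {

  // Keeps only the custom metrics whose names appear in the enabled list
  // (i.e. the value of spark.sql.streaming.ui.enabledCustomMetricList).
  // Assumption: names are compared exactly.
  def metricsToDisplay(
      customMetrics: Map[String, Long],
      enabledMetricNames: Seq[String]): Map[String, Long] = {
    val enabled = enabledMetricNames.toSet
    customMetrics.filter { case (name, _) => enabled.contains(name) }
  }
}
```

For example, with the metrics `Map("loadedMapCacheHitCount" -> 5L, "stateOnCurrentVersionSizeBytes" -> 2048L)` and the enabled list `Seq("loadedMapCacheHitCount")`, only the first entry would be kept for display.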

The custom metrics are included in jsonValue.

copy

copy(
  newNumRowsUpdated: Long,
  newNumRowsDroppedByWatermark: Long): StateOperatorProgress

copy creates a copy of this StateOperatorProgress with the numRowsUpdated and numRowsDroppedByWatermark metrics updated.
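The behaviour of copy can be sketched with a simplified stand-in class. `ProgressSketch` below is hypothetical and carries only a few of the fields listed earlier; it is meant to show that the two given metrics are replaced while everything else is carried over unchanged, not to mirror the actual Spark implementation.

```scala
// Simplified, hypothetical stand-in for StateOperatorProgress
// (only a subset of the fields, for illustration).
final class ProgressSketch(
    val operatorName: String,
    val numRowsTotal: Long,
    val numRowsUpdated: Long,
    val numRowsDroppedByWatermark: Long,
    val customMetrics: Map[String, Long] = Map.empty) {

  // Returns a new instance with the two metrics replaced;
  // all the other fields are copied as they are.
  def copy(
      newNumRowsUpdated: Long,
      newNumRowsDroppedByWatermark: Long): ProgressSketch =
    new ProgressSketch(
      operatorName = operatorName,
      numRowsTotal = numRowsTotal,
      numRowsUpdated = newNumRowsUpdated,
      numRowsDroppedByWatermark = newNumRowsDroppedByWatermark,
      customMetrics = customMetrics)
}
```

The original instance is left untouched, so a caller can keep reporting the earlier values while passing the adjusted copy along.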


copy is used when:

jsonValue

jsonValue: JValue

jsonValue...FIXME


jsonValue is used when: