UnsupportedOperationChecker¶
UnsupportedOperationChecker checks whether the logical plan of a streaming query uses supported operations only.
UnsupportedOperationChecker is used when the internal spark.sql.streaming.unsupportedOperationCheck configuration property is enabled.
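As a rough illustration, the property can be set when a SparkSession is built. The fragment below is a sketch that assumes Spark SQL is on the classpath; the application name and the local master are placeholders, not something this page prescribes. Because the property is internal, turning the check off is probably only sensible for experiments.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a throwaway local session; appName and master are placeholders.
val spark = SparkSession.builder
  .master("local[*]")
  .appName("unsupported-operation-check-demo")
  // "false" disables the property, so UnsupportedOperationChecker is not used.
  .config("spark.sql.streaming.unsupportedOperationCheck", "false")
  .getOrCreate()

// Read the effective value back from the session configuration.
println(spark.conf.get("spark.sql.streaming.unsupportedOperationCheck"))
```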
Note
UnsupportedOperationChecker actually comes with two methods, i.e. checkForBatch and checkForStreaming.
The Spark Structured Streaming gitbook is solely focused on checkForStreaming.
checkForStreaming Method¶
checkForStreaming(
plan: LogicalPlan,
outputMode: OutputMode): Unit
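checkForStreaming asserts that the following requirements hold:
- Only one streaming aggregation is allowed
- Streaming aggregation with Append output mode requires watermark (on the grouping expressions)
- Multiple FlatMapGroupsWithState logical operators are only allowed with Append output mode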
checkForStreaming
...FIXME
checkForStreaming finds all streaming aggregates (i.e. Aggregate logical operators with streaming sources).
Note
The Aggregate logical operator represents the Dataset.groupBy and Dataset.groupByKey operators (and SQL's GROUP BY clause) in a logical query plan.
checkForStreaming asserts that there is at most one streaming aggregation in a streaming query.
Otherwise, checkForStreaming reports an AnalysisException:
Multiple streaming aggregations are not supported with streaming DataFrames/Datasets
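The check can be pictured with a small, self-contained model. This is a sketch only: Plan, Source and Aggregate are hypothetical stand-ins for Spark's logical operators, and check returns an Either instead of throwing an AnalysisException, so it illustrates the rule rather than Spark's actual code.

```scala
object StreamingAggregationCheck {
  // Hypothetical, heavily simplified stand-ins for Spark's logical operators.
  sealed trait Plan { def children: Seq[Plan] }
  case class Source(isStreaming: Boolean) extends Plan {
    val children: Seq[Plan] = Nil
  }
  case class Aggregate(child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
  }

  // A plan is streaming if any of its leaves is a streaming source.
  def isStreaming(plan: Plan): Boolean = plan match {
    case Source(streaming) => streaming
    case other             => other.children.exists(isStreaming)
  }

  // Collect every Aggregate that sits on top of a streaming source.
  def streamingAggregates(plan: Plan): Seq[Aggregate] = {
    val nested = plan.children.flatMap(streamingAggregates)
    plan match {
      case agg: Aggregate if isStreaming(agg) => agg +: nested
      case _                                  => nested
    }
  }

  // Left carries the error message quoted above; Right means the check passed.
  def check(plan: Plan): Either[String, Unit] =
    if (streamingAggregates(plan).size > 1)
      Left("Multiple streaming aggregations are not supported with streaming DataFrames/Datasets")
    else
      Right(())
}
```

In this model, a single Aggregate over a streaming Source passes, while an Aggregate stacked on another Aggregate over the same streaming Source is rejected. Two stacked aggregations over a batch Source pass, because neither counts as a streaming aggregate.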
checkForStreaming asserts that a watermark was defined (on at least one of the grouping expressions) for a streaming aggregation with Append output mode.
Otherwise, checkForStreaming reports an AnalysisException:
Append output mode not supported when there are streaming aggregations on streaming DataFrames/DataSets without watermark
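A minimal model of this rule might look as follows. The types are assumptions made for illustration: GroupingExpr, its hasWatermark flag and StreamingAggregate are hypothetical, and the result is an Either rather than a thrown AnalysisException.

```scala
object AppendModeWatermarkCheck {
  // Simplified output modes for the sketch.
  sealed trait OutputMode
  case object Append extends OutputMode
  case object Update extends OutputMode
  case object Complete extends OutputMode

  // Hypothetical grouping expression: a name plus a flag saying whether
  // a watermark was defined on it.
  case class GroupingExpr(name: String, hasWatermark: Boolean)
  case class StreamingAggregate(groupingExprs: Seq[GroupingExpr])

  // Fails only for Append mode with a streaming aggregation whose grouping
  // expressions carry no watermark at all.
  def check(agg: Option[StreamingAggregate], mode: OutputMode): Either[String, Unit] =
    (agg, mode) match {
      case (Some(a), Append) if !a.groupingExprs.exists(_.hasWatermark) =>
        Left("Append output mode not supported when there are streaming aggregations on " +
          "streaming DataFrames/DataSets without watermark")
      case _ =>
        Right(())
    }
}
```

One watermarked grouping expression is enough to pass in Append mode. The same aggregation without a watermark passes in Update or Complete mode, since the rule only concerns Append.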
CAUTION: FIXME
checkForStreaming counts all FlatMapGroupsWithState logical operators (on streaming Datasets with isMapGroupsWithState flag disabled).
Note
FlatMapGroupsWithState.isMapGroupsWithState flag is disabled when...FIXME
checkForStreaming asserts that multiple FlatMapGroupsWithState logical operators are only used when:
- outputMode is Append output mode
- outputMode of the FlatMapGroupsWithState logical operators is also Append output mode
CAUTION: FIXME Reference to an example in flatMapGroupsWithState
Otherwise, checkForStreaming reports an AnalysisException:
Multiple flatMapGroupsWithStates are not supported when they are not all in append mode or the output mode is not append on a streaming DataFrames/Datasets
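The two conditions can be sketched with a simplified model. The FlatMapGroupsWithState case class here is a hypothetical stand-in that keeps only the operator's output mode and the isMapGroupsWithState flag, and the check returns an Either instead of throwing an AnalysisException.

```scala
object MultipleFlatMapGroupsWithStateCheck {
  // Simplified output modes for the sketch.
  sealed trait OutputMode
  case object Append extends OutputMode
  case object Update extends OutputMode

  // Hypothetical stand-in for the logical operator on a streaming Dataset.
  case class FlatMapGroupsWithState(outputMode: OutputMode, isMapGroupsWithState: Boolean)

  def check(ops: Seq[FlatMapGroupsWithState], queryMode: OutputMode): Either[String, Unit] = {
    // Only operators with the isMapGroupsWithState flag disabled are counted.
    val flatMapOps = ops.filterNot(_.isMapGroupsWithState)
    // Both the query and every counted operator have to use Append mode.
    val allAppend = queryMode == Append && flatMapOps.forall(_.outputMode == Append)
    if (flatMapOps.size > 1 && !allAppend)
      Left("Multiple flatMapGroupsWithStates are not supported when they are not all in append mode " +
        "or the output mode is not append on a streaming DataFrames/Datasets")
    else
      Right(())
  }
}
```

In this model, a single operator passes regardless of mode. Two operators pass only when the query and both operators use Append.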
CAUTION: FIXME
checkForStreaming is used when StreamingQueryManager is requested to create a StreamingQueryWrapper (for starting a streaming query), but only when the internal spark.sql.streaming.unsupportedOperationCheck configuration property is enabled.