FlatMapGroupsWithState Logical Operator¶
FlatMapGroupsWithState is a binary logical operator (Spark SQL) that represents the following high-level operators in (a logical query plan of) a structured query:
Creating Instance¶
FlatMapGroupsWithState takes the following to be created:
- State function (
(Any, Iterator[Any], LogicalGroupState[Any]) => Iterator[Any]) - Catalyst Expression for keys
- Catalyst Expression for values
- Grouping Attributes
- Data Attributes
- Output Object Attribute
- State
ExpressionEncoder - OutputMode
-
isMapGroupsWithStateflag (default:false) - GroupStateTimeout
-
hasInitialStateflag (default:false) - Initial State Group Attributes
- Initial State Data Attributes
- Initial State Deserializer
-
LogicalPlanof the initial state - Child logical operator
Creating FlatMapGroupsWithState (under SerializeFromObject)¶
apply[K: Encoder, V: Encoder, S: Encoder, U: Encoder](
func: (Any, Iterator[Any], LogicalGroupState[Any]) => Iterator[Any],
groupingAttributes: Seq[Attribute],
dataAttributes: Seq[Attribute],
outputMode: OutputMode,
isMapGroupsWithState: Boolean,
timeout: GroupStateTimeout,
child: LogicalPlan): LogicalPlan
apply[K: Encoder, V: Encoder, S: Encoder, U: Encoder](
func: (Any, Iterator[Any], LogicalGroupState[Any]) => Iterator[Any],
groupingAttributes: Seq[Attribute],
dataAttributes: Seq[Attribute],
outputMode: OutputMode,
isMapGroupsWithState: Boolean,
timeout: GroupStateTimeout,
child: LogicalPlan,
initialStateGroupAttrs: Seq[Attribute],
initialStateDataAttrs: Seq[Attribute],
initialState: LogicalPlan): LogicalPlan
apply creates a SerializeFromObject logical operator with a FlatMapGroupsWithState as its child logical operator.
Internally, apply creates SerializeFromObject object consumer (aka unary logical operator) with FlatMapGroupsWithState logical plan.
Internally, apply finds ExpressionEncoder for the type S and creates a FlatMapGroupsWithState with UnresolvedDeserializer for the types K and V.
In the end, apply creates a SerializeFromObject object consumer with the FlatMapGroupsWithState.
Execution Planning¶
FlatMapGroupsWithState is resolved (planned) to:
-
FlatMapGroupsWithStateExec physical operator for streaming queries (by FlatMapGroupsWithStateStrategy execution planning strategy)
-
CoGroupExecorMapGroupsExecphysical operators for batch queries (byBasicOperatorsexecution planning strategy)
ObjectProducer¶
FlatMapGroupsWithState is an ObjectProducer (Spark SQL).