FlatMapGroupsWithState Logical Operator¶
FlatMapGroupsWithState
is a binary logical operator that represents the following KeyValueGroupedDataset high-level operators:
BinaryNode¶
FlatMapGroupsWithState
is a binary logical operator with two child operators:
- The child as the left operator
- The initialState as the right operator
ObjectProducer¶
FlatMapGroupsWithState
is an ObjectProducer
.
Creating Instance¶
FlatMapGroupsWithState
takes the following to be created:
- State Mapping Function (
(Any, Iterator[Any], LogicalGroupState[Any]) => Iterator[Any]
) - Key Deserializer Expression
- Value Deserializer Expression
- Grouping Attributes
- Data Attributes
- Output Object Attribute
- State ExpressionEncoder
-
OutputMode
-
isMapGroupsWithState
flag (default:false
) -
GroupStateTimeout
-
hasInitialState
flag (default:false
) - Initial State Group Attributes
- Initial State Data Attributes
- Initial State Deserializer Expression
- Initial State LogicalPlan
- Child LogicalPlan
FlatMapGroupsWithState
is created using apply factory.
Creating FlatMapGroupsWithState¶
apply[K: Encoder, V: Encoder, S: Encoder, U: Encoder](
func: (Any, Iterator[Any], LogicalGroupState[Any]) => Iterator[Any],
groupingAttributes: Seq[Attribute],
dataAttributes: Seq[Attribute],
outputMode: OutputMode,
isMapGroupsWithState: Boolean,
timeout: GroupStateTimeout,
child: LogicalPlan): LogicalPlan
apply[K: Encoder, V: Encoder, S: Encoder, U: Encoder](
func: (Any, Iterator[Any], LogicalGroupState[Any]) => Iterator[Any],
groupingAttributes: Seq[Attribute],
dataAttributes: Seq[Attribute],
outputMode: OutputMode,
isMapGroupsWithState: Boolean,
timeout: GroupStateTimeout,
child: LogicalPlan,
initialStateGroupAttrs: Seq[Attribute],
initialStateDataAttrs: Seq[Attribute],
initialState: LogicalPlan): LogicalPlan
apply
creates a FlatMapGroupsWithState with the following:
UnresolvedDeserializer
s for the keys, values and the initial state- Generates the output object attribute
In the end, apply
creates a SerializeFromObject unary logical operator with the FlatMapGroupsWithState
operator as the child.
apply
is used for the following high-level operators:
Execution Planning¶
FlatMapGroupsWithState
is planed as follows:
Physical Operator | Execution Planning Strategy |
---|---|
FlatMapGroupsWithStateExec (Spark Structured Streaming) | FlatMapGroupsWithStateStrategy (Spark Structured Streaming) |
CoGroupExec or MapGroupsExec | BasicOperators |