Skip to content

FlatMapGroupsWithState Logical Operator

FlatMapGroupsWithState is a binary logical operator (Spark SQL) that represents the following high-level operators in (a logical query plan of) a structured query:

Creating Instance

FlatMapGroupsWithState takes the following to be created:

  • State function ((Any, Iterator[Any], LogicalGroupState[Any]) => Iterator[Any])
  • Catalyst Expression for keys
  • Catalyst Expression for values
  • Grouping Attributes
  • Data Attributes
  • Output Object Attribute
  • State ExpressionEncoder
  • OutputMode
  • isMapGroupsWithState flag (default: false)
  • GroupStateTimeout
  • hasInitialState flag (default: false)
  • Initial State Group Attributes
  • Initial State Data Attributes
  • Initial State Deserializer
  • LogicalPlan of the initial state
  • Child logical operator

Creating FlatMapGroupsWithState (under SerializeFromObject)

apply[K: Encoder, V: Encoder, S: Encoder, U: Encoder](
  func: (Any, Iterator[Any], LogicalGroupState[Any]) => Iterator[Any],
  groupingAttributes: Seq[Attribute],
  dataAttributes: Seq[Attribute],
  outputMode: OutputMode,
  isMapGroupsWithState: Boolean,
  timeout: GroupStateTimeout,
  child: LogicalPlan): LogicalPlan
apply[K: Encoder, V: Encoder, S: Encoder, U: Encoder](
  func: (Any, Iterator[Any], LogicalGroupState[Any]) => Iterator[Any],
  groupingAttributes: Seq[Attribute],
  dataAttributes: Seq[Attribute],
  outputMode: OutputMode,
  isMapGroupsWithState: Boolean,
  timeout: GroupStateTimeout,
  child: LogicalPlan,
  initialStateGroupAttrs: Seq[Attribute],
  initialStateDataAttrs: Seq[Attribute],
  initialState: LogicalPlan): LogicalPlan

apply creates a SerializeFromObject logical operator with a FlatMapGroupsWithState as its child logical operator.


Internally, apply creates SerializeFromObject object consumer (aka unary logical operator) with FlatMapGroupsWithState logical plan.

Internally, apply finds ExpressionEncoder for the type S and creates a FlatMapGroupsWithState with UnresolvedDeserializer for the types K and V.

In the end, apply creates a SerializeFromObject object consumer with the FlatMapGroupsWithState.

Execution Planning

FlatMapGroupsWithState is resolved (planned) to:

ObjectProducer

FlatMapGroupsWithState is an ObjectProducer (Spark SQL).