AggregateInPandasExec Physical Operator

AggregateInPandasExec is a unary physical operator (Spark SQL).

Creating Instance

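AggregateInPandasExec takes the following to be created: grouping expressions (NamedExpressions), PythonUDF aggregate expressions, result expressions (NamedExpressions), and a child physical operator (SparkPlan).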

AggregateInPandasExec is created when the Aggregation execution planning strategy (Spark SQL) plans an Aggregate logical operator (Spark SQL) whose aggregate expressions are all PythonUDFs.

Executing Operator

doExecute(): RDD[InternalRow]

doExecute uses ArrowPythonRunner (one per partition) to execute PythonUDFs.

doExecute is part of the SparkPlan (Spark SQL) abstraction.
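The per-partition work of doExecute can be pictured with a small conceptual model. This is plain Python, not Spark code: it does not use ArrowPythonRunner or Arrow at all, and it assumes that rows with the same grouping key arrive contiguously within a partition (an assumption not stated on this page). The function aggregate_partition and the sample data are hypothetical; agg_func stands in for the Python UDF that is evaluated once per group.

```python
from itertools import groupby
from operator import itemgetter
from statistics import mean


def aggregate_partition(rows, agg_func):
    """Conceptual stand-in for the work done on a single partition.

    rows: iterable of (key, value) pairs, already sorted by key
    agg_func: hypothetical stand-in for the Python aggregate UDF
    """
    # Walk the partition one group of equal keys at a time.
    for key, group in groupby(rows, key=itemgetter(0)):
        values = [value for _, value in group]
        # One call of the aggregate function per group, joined with its key.
        yield key, agg_func(values)


partition = [(1, 1.0), (1, 2.0), (2, 3.0)]
result = list(aggregate_partition(partition, mean))
print(result)  # [(1, 1.5), (2, 3.0)]
```

In the real operator the function runs in a separate Python worker process rather than in-process as in this sketch.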

Last update: 2021-03-03