PythonUDF

PythonUDF is a Catalyst expression (Spark SQL).

Creating Instance

PythonUDF takes the following to be created:

PythonUDF is created when:

  • UserDefinedPythonFunction is requested to builder

Unevaluable

PythonUDF is an Unevaluable expression (Spark SQL).

NonSQLExpression

PythonUDF is a NonSQLExpression expression (Spark SQL).

UserDefinedExpression

PythonUDF is a UserDefinedExpression expression (Spark SQL).

isScalarPythonUDF

isScalarPythonUDF(
  e: Expression): Boolean

isScalarPythonUDF is true when all of the following hold:

  • The given Expression is a PythonUDF
  • The eval type of the PythonUDF is one of the scalar eval types

isScalarPythonUDF is used when:

  • ExtractPythonUDFFromJoinCondition is requested to hasUnevaluablePythonUDF
  • ExtractPythonUDFFromAggregate is requested to hasPythonUdfOverAggregate
  • ExtractGroupingPythonUDFFromAggregate is requested to hasScalarPythonUDF
  • ExtractPythonUDFs is requested to hasScalarPythonUDF, collectEvaluableUDFs, extract
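The check can be modelled with a small, self-contained sketch. It does not use Spark's classes: Expression, Literal and PythonUDF below are toy stand-ins, ScalarSketch is a hypothetical name, and the set of scalar eval types is an assumption (it varies across Spark versions, so check the PythonUDF companion object in the version you run). The numeric codes are assumed to match PySpark's PythonEvalType.

```scala
object ScalarSketch {
  // Toy stand-ins for Catalyst types; NOT Spark's real classes.
  sealed trait Expression
  final case class Literal(value: Any) extends Expression
  final case class PythonUDF(name: String, evalType: Int) extends Expression

  // Eval type codes, assumed to match PySpark's PythonEvalType.
  val SQL_BATCHED_UDF = 100
  val SQL_SCALAR_PANDAS_UDF = 200
  val SQL_GROUPED_AGG_PANDAS_UDF = 202
  val SQL_SCALAR_PANDAS_ITER_UDF = 204

  // Assumption: the eval types treated as scalar (version-dependent).
  val scalarTypes: Set[Int] =
    Set(SQL_BATCHED_UDF, SQL_SCALAR_PANDAS_UDF, SQL_SCALAR_PANDAS_ITER_UDF)

  // True only for a PythonUDF whose eval type is in the scalar set.
  def isScalarPythonUDF(e: Expression): Boolean = e match {
    case udf: PythonUDF => scalarTypes.contains(udf.evalType)
    case _              => false
  }
}
```

Pattern matching on the expression type keeps both conditions in one place: any expression that is not a PythonUDF falls through to false.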

Scalar PythonUDF Types

PythonUDF is scalar for the following eval types:

isGroupedAggPandasUDF

isGroupedAggPandasUDF(
  e: Expression): Boolean

isGroupedAggPandasUDF is true when the given Expression is a PythonUDF with SQL_GROUPED_AGG_PANDAS_UDF eval type. Otherwise, isGroupedAggPandasUDF is false.
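
The rule above can be sketched without a Spark dependency. The types below are toy stand-ins rather than Spark's classes, GroupedAggSketch and Attribute are hypothetical names used only for illustration, and the numeric eval type codes are assumed to match PySpark's PythonEvalType.

```scala
object GroupedAggSketch {
  // Toy stand-ins for Catalyst types; NOT Spark's real classes.
  sealed trait Expression
  final case class Attribute(name: String) extends Expression
  final case class PythonUDF(name: String, evalType: Int) extends Expression

  // Eval type codes, assumed to match PySpark's PythonEvalType.
  val SQL_SCALAR_PANDAS_UDF = 200
  val SQL_GROUPED_AGG_PANDAS_UDF = 202

  // True only for a PythonUDF with the grouped-aggregate pandas eval type;
  // every other expression (or eval type) yields false.
  def isGroupedAggPandasUDF(e: Expression): Boolean = e match {
    case PythonUDF(_, evalType) => evalType == SQL_GROUPED_AGG_PANDAS_UDF
    case _                      => false
  }
}
```

For example, GroupedAggSketch.isGroupedAggPandasUDF applied to a PythonUDF carrying SQL_SCALAR_PANDAS_UDF returns false, as does any non-PythonUDF expression such as Attribute("id").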