
functions.py

functions.py defines pandas_udf for creating pandas user-defined functions.

functions.py is part of the pyspark.sql.pandas package.

from pyspark.sql.functions import pandas_udf

pandas_udf

pandas_udf(
  f=None,
  returnType=None,
  functionType=None)

pandas_udf creates a pandas user-defined function.

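pandas_udf uses _create_pandas_udf to create the user-defined function. When used as a decorator, pandas_udf instead returns a partial function of _create_pandas_udf (using Python's functools.partial) that is then applied to the decorated function.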
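
The role of functools.partial in the decorator case can be pictured with a small stand-alone sketch. This is not the PySpark source: make_udf and create_udf are hypothetical stand-ins for pandas_udf and _create_pandas_udf, the string values are placeholders, and the argument handling is deliberately simplified. It only shows how the call to the factory is deferred until the decorated function is available.

```python
import functools


def create_udf(f, returnType, evalType):
    """Hypothetical stand-in for _create_pandas_udf: wraps f and records metadata."""
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)

    wrapper.returnType = returnType
    wrapper.evalType = evalType
    return wrapper


def make_udf(f=None, returnType=None, functionType=None):
    """Hypothetical stand-in for pandas_udf (simplified argument handling)."""
    if callable(f):
        # Direct call: make_udf(func, returnType, functionType)
        return create_udf(f, returnType, functionType)

    # Decorator form: @make_udf(returnType, functionType).
    # The first positional argument holds the return type and the second
    # the function type, so both are shifted by one position.
    return_type = f if f is not None else returnType
    if functionType is not None:
        eval_type = functionType
    else:
        eval_type = returnType if f is not None else None

    # The function to wrap is not known yet, so return a partial that
    # receives it as the remaining argument.
    return functools.partial(create_udf, returnType=return_type, evalType=eval_type)


@make_udf("long", "scalar")
def plus_one(x):
    return x + 1
```

Applying the decorator first evaluates make_udf("long", "scalar"), which returns the partial; Python then calls that partial with plus_one, which is the point where create_udf finally runs.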

Decorator

pandas_udf can be, and usually is, used as a Python decorator with two positional arguments: the return type and the function type.

@pandas_udf(returnType, functionType)

returnType

returnType can be one of the following:

  • pyspark.sql.types.DataType
  • A DDL-formatted type string

functionType

functionType must be one of the values from PandasUDFType.

_create_pandas_udf

_create_pandas_udf(
  f,
  returnType,
  evalType)

_create_pandas_udf...FIXME