functions.py¶
functions.py defines pandas_udf for creating pandas user-defined functions.
functions.py is part of the pyspark.sql.pandas package.
from pyspark.sql.functions import pandas_udf
pandas_udf¶
pandas_udf(
f=None,
returnType=None,
functionType=None)
pandas_udf creates a pandas user-defined function.
pandas_udf delegates to _create_pandas_udf (wrapping it in a partial function with functools.partial (Python) when used as a decorator).
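The decorator-or-direct-call pattern described above can be sketched in plain Python. This is a minimal illustration and not PySpark's implementation: my_udf and _create are hypothetical stand-ins for pandas_udf and _create_pandas_udf, and the string "scalar" is a placeholder for a function type.

```python
import functools


def _create(f, return_type, eval_type):
    """Hypothetical stand-in for _create_pandas_udf: attach metadata to f."""
    f.return_type = return_type
    f.eval_type = eval_type
    return f


def my_udf(f=None, return_type=None, eval_type=None):
    """Hypothetical dual-mode factory: call it directly or use it as a decorator."""
    if callable(f):
        # Direct call: the function is already known, so create the UDF right away.
        return _create(f, return_type, eval_type)
    if f is not None:
        # Decorator form with positional arguments: the first argument is the
        # return type, and the second one (if any) is the function type.
        return_type, eval_type = f, (return_type if eval_type is None else eval_type)
    # The function is not known yet: return a partial that waits for it.
    return functools.partial(_create, return_type=return_type, eval_type=eval_type)


@my_udf("long", "scalar")
def plus_one(x):
    return x + 1


def times_two(x):
    return x * 2


doubled = my_udf(times_two, "long", "scalar")
```

In the decorator form, Python first evaluates my_udf("long", "scalar"), which returns the partial, and then applies that partial to plus_one. In the direct form, the function is passed as the first argument and no partial is needed.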
Decorator¶
pandas_udf can be, and usually is, used as a Python decorator with two positional arguments for the return and function types.
@pandas_udf(returnType, functionType)
returnType¶
returnType can be one of the following:
- pyspark.sql.types.DataType
- A DDL-formatted type string
functionType¶
functionType must be one of the values from PandasUDFType:
- SQL_SCALAR_PANDAS_UDF
- SQL_SCALAR_PANDAS_ITER_UDF
- SQL_GROUPED_MAP_PANDAS_UDF
- SQL_GROUPED_AGG_PANDAS_UDF
- SQL_MAP_PANDAS_ITER_UDF
- SQL_COGROUPED_MAP_PANDAS_UDF
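The "must be one of" requirement can be pictured as a membership check. The sketch below is an illustration only: FunctionType and check_function_type are hypothetical names, the enum values are arbitrary (PySpark defines its own constants), and accepting None is an assumption based on the functionType=None default in the signature above.

```python
from enum import Enum, auto


class FunctionType(Enum):
    """Hypothetical stand-in for the allowed function types (values are arbitrary)."""
    SQL_SCALAR_PANDAS_UDF = auto()
    SQL_SCALAR_PANDAS_ITER_UDF = auto()
    SQL_GROUPED_MAP_PANDAS_UDF = auto()
    SQL_GROUPED_AGG_PANDAS_UDF = auto()
    SQL_MAP_PANDAS_ITER_UDF = auto()
    SQL_COGROUPED_MAP_PANDAS_UDF = auto()


def check_function_type(value):
    """Return value unchanged if it is allowed, otherwise raise ValueError."""
    # Assumption: None means "not specified" and is allowed.
    if value is None or isinstance(value, FunctionType):
        return value
    allowed = ", ".join(member.name for member in FunctionType)
    raise ValueError(f"Invalid function type {value!r}; expected one of: {allowed}")
```

Rejecting an unknown function type at definition time, as in this sketch, surfaces a mistake when the UDF is declared instead of when a query runs.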
_create_pandas_udf¶
_create_pandas_udf(
f,
returnType,
evalType)
_create_pandas_udf...FIXME