functions.py¶
functions.py defines pandas_udf for creating pandas user-defined functions.
functions.py is part of the pyspark.sql.pandas package.
from pyspark.sql.functions import pandas_udf
pandas_udf¶
pandas_udf(
f=None,
returnType=None,
functionType=None)
pandas_udf creates a pandas user-defined function.
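pandas_udf delegates to _create_pandas_udf. When used as a decorator (that is, called without the function f), pandas_udf instead returns a partial function of _create_pandas_udf, built with functools.partial (Python), that is later applied to the decorated function.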
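The direct-call versus decorator dispatch can be illustrated with a minimal, standalone sketch. This is not PySpark's code: make_udf and _create are hypothetical stand-ins for pandas_udf and _create_pandas_udf, and the callable check used to detect decorator usage is an assumption made for illustration.

```python
import functools


def _create(f, returnType, evalType):
    # Hypothetical stand-in for _create_pandas_udf:
    # simply record the metadata on the function and return it.
    f.returnType = returnType
    f.evalType = evalType
    return f


def make_udf(f=None, returnType=None, functionType=None):
    # Hypothetical stand-in for pandas_udf.
    is_decorator = f is None or not callable(f)
    if not is_decorator:
        # Direct call: the function is known, so create the UDF right away.
        return _create(f, returnType=returnType, evalType=functionType)
    if f is None:
        # Keyword form: @make_udf(returnType=..., functionType=...)
        return_type, eval_type = returnType, functionType
    else:
        # Positional form: @make_udf(return_type, function_type).
        # The arguments are shifted by one because no function was passed.
        return_type = f
        eval_type = functionType if functionType is not None else returnType
    # The function is not known yet: return a partial that waits for it.
    return functools.partial(_create, returnType=return_type, evalType=eval_type)


@make_udf("long", "scalar")
def plus_one(x):
    return x + 1


direct = make_udf(lambda x: x * 2, returnType="long", functionType="scalar")
```

In both cases the result carries the same metadata. The only difference is whether the creation step runs immediately or is deferred until Python applies the decorator to the function.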
Decorator¶
pandas_udf can be, and usually is, used as a Python decorator with two positional arguments for the return and function types.
@pandas_udf(returnType, functionType)
returnType¶
returnType can be one of the following:
- pyspark.sql.types.DataType
- A DDL-formatted type string
functionType¶
functionType must be one of the values from PandasUDFType:
- SQL_SCALAR_PANDAS_UDF
- SQL_SCALAR_PANDAS_ITER_UDF
- SQL_GROUPED_MAP_PANDAS_UDF
- SQL_GROUPED_AGG_PANDAS_UDF
- SQL_MAP_PANDAS_ITER_UDF
- SQL_COGROUPED_MAP_PANDAS_UDF
_create_pandas_udf¶
_create_pandas_udf(
f,
returnType,
evalType)
_create_pandas_udf
...FIXME