udf.py¶
udf module (in pyspark.sql package) defines UDFRegistration.
from pyspark.sql.udf import *
__all__¶
import *
The import statement uses the following convention: if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered.
Learn more in 6.4.1. Importing * From a Package.
_create_udf¶
_create_udf(
f: Callable[..., Any],
returnType: "DataTypeOrString",
evalType: int,
name: Optional[str] = None,
deterministic: bool = True) -> "UserDefinedFunctionLike"
_create_udf creates a UserDefinedFunction (with the name of the object to be the name of function f).
_create_udf is used when:
UDFRegistrationis requested to register- udf is used (and _create_py_udf is executed)
- pandas_udf (from
pyspark.sql.pandas) is executed
_create_py_udf¶
_create_py_udf(
f: Callable[..., Any],
returnType: "DataTypeOrString",
evalType: int,
) -> "UserDefinedFunctionLike"
_create_py_udf...FIXME
_create_py_udf is used when:
- udf is executed
Creating SimplePythonFunction for (Pickled) Python Function¶
_wrap_function(
sc: SparkContext,
func: Callable[..., Any],
returnType: "DataTypeOrString") -> JavaObject
_wrap_function creates a command tuple with the given func and returnType.
_wrap_function _prepare_for_python_RDD for the command tuple that builds the input for a SimplePythonFunction:
pickled_commandbyte arrayenvincludesbroadcast_vars
In the end, _wrap_function creates a SimplePythonFunction with the above and the following from the given SparkContext:
_wrap_function is used when:
UserDefinedFunctionis requested to _create_judf