UDFRegistration

UDFRegistration is a Python class in the pyspark.sql.udf module.

Registering Python UDF

register(
  self,
  name: str,
  f: Union[Callable[..., Any], "UserDefinedFunctionLike"],
  returnType: Optional[Union[pyspark.sql.types.DataType, str]] = None,
) -> "UserDefinedFunctionLike"

register registers a Python function (incl. lambda function) or a user-defined function as a SQL function (under the given name).

Function f Description
A Python function
  • Includes lambda (unnamed) functions
  • Callable[..., Any]
  • The return type is StringType when not specified
  • Always PythonEvalType.SQL_BATCHED_UDF
pyspark.sql.functions.udf
  • row-at-a-time
  • UserDefinedFunctionLike
pyspark.sql.functions.pandas_udf
  • vectorized
  • UserDefinedFunctionLike

evalType of a user-defined function can be one of the following:


register creates the user-defined function (using _create_udf) and requests the _jsparkSession for the UDFRegistration (Spark SQL) to registerPython (Spark SQL) it.
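The flow described above can be pictured with a pure-Python sketch that needs no Spark at all. Everything in it is a hypothetical stand-in: SketchUDF, SketchRegistry, register_python and the string constants are invented for illustration and are not PySpark classes or the actual implementation. The check on the allowed evalType values is left out.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional

# Hypothetical stand-in for PythonEvalType.SQL_BATCHED_UDF
BATCHED = "SQL_BATCHED_UDF"


@dataclass
class SketchUDF:
    """Hypothetical stand-in for a user-defined function object."""
    func: Callable[..., Any]
    return_type: str
    eval_type: str
    name: Optional[str] = None


@dataclass
class SketchRegistry:
    """Hypothetical stand-in for the JVM-side UDFRegistration."""
    functions: Dict[str, SketchUDF] = field(default_factory=dict)

    def register_python(self, name: str, udf: SketchUDF) -> None:
        self.functions[name] = udf


def register(registry: SketchRegistry, name: str, f: Any,
             return_type: Optional[str] = None) -> SketchUDF:
    if isinstance(f, SketchUDF):
        # Already a UDF: it keeps its own return type and eval type
        if return_type is not None:
            raise TypeError("return type cannot be given for a user-defined function")
        created = SketchUDF(f.func, f.return_type, f.eval_type, name)
    else:
        # Plain function: string by default, always row-at-a-time (batched)
        created = SketchUDF(f, return_type or "string", BATCHED, name)
    # Hand the new UDF over to the registry under the given name
    registry.register_python(name, created)
    return created
```

A quick usage check: registering a lambda without a return type yields a UDF whose return_type is "string" and whose eval_type is the batched one, and the registry can look it up by name.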

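from pyspark.sql.functions import call_udf
from pyspark.sql.types import IntegerType

# Assumes an active SparkSession available as spark (e.g. in the pyspark shell)
rows = [(1, "a"), (2, "b"), (3, "c")]
columns = ["id", "name"]
df = spark.createDataFrame(rows, columns)

# Register a lambda as the SQL function intX2 that returns an integer
spark.udf.register("intX2", lambda i: i * 2, IntegerType())

# Call the registered function by name on the id column
df.select(call_udf("intX2", "id")).show()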