UDFRegistration¶
UDFRegistration is a facade to a session-scoped FunctionRegistry to register user-defined functions (UDFs) and user-defined aggregate functions (UDAFs).
Creating Instance¶
UDFRegistration takes the following to be created:
- FunctionRegistry
UDFRegistration is created when:
- BaseSessionStateBuilder is requested for the UDFRegistration
Accessing UDFRegistration¶
UDFRegistration is available as SparkSession.udf.
import org.apache.spark.sql.UDFRegistration
assert(spark.udf.isInstanceOf[UDFRegistration])
SessionState¶
UDFRegistration is used to create a SessionState.
Registering UserDefinedFunction¶
register(
name: String,
udf: UserDefinedFunction): UserDefinedFunction
register associates the given name with the given UserDefinedFunction.
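As a minimal sketch of the association described above, the following registers a Scala UDF under a name and then calls it from SQL. The function name to_upper, the application name, and the local master are assumptions made for illustration only; in spark-shell the builder simply returns the existing session.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

// Assumption: a local session for demonstration purposes.
// In spark-shell, getOrCreate() returns the session that already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("udf-register-demo") // hypothetical application name
  .getOrCreate()

// A hypothetical, null-safe UDF that upper-cases a string.
val toUpper = udf((s: String) => if (s == null) null else s.toUpperCase)

// Associate the name "to_upper" with the UDF through spark.udf (a UDFRegistration).
val registered = spark.udf.register("to_upper", toUpper)

// The registered name can now be used in SQL queries of this session.
val result = spark.sql("SELECT to_upper('spark') AS value").head.getString(0)
```

The registration is scoped to the session, so a function registered this way should not be expected to be visible from another SparkSession.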
register requests the FunctionRegistry to createOrReplaceTempFunction with the given name, the scala_udf source name, and a function builder that depends on the type of the UserDefinedFunction:
- For UserDefinedAggregators, the function builder requests the UserDefinedAggregator for a ScalaAggregator
- For all other types, the function builder requests the UserDefinedFunction for a Column and takes the Expression
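The aggregator branch can be exercised with a typed Aggregator wrapped by functions.udaf (available since Spark 3.0), which is expected to produce a UserDefinedAggregator. This is a sketch under stated assumptions, not a description of the internals: the LongSum aggregator, the long_sum function name, and the local session are all hypothetical.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.functions.udaf

// A hypothetical aggregator that sums Long values.
object LongSum extends Aggregator[Long, Long, Long] {
  def zero: Long = 0L                                        // initial buffer value
  def reduce(buffer: Long, value: Long): Long = buffer + value // fold one input into the buffer
  def merge(b1: Long, b2: Long): Long = b1 + b2              // combine two partial buffers
  def finish(reduction: Long): Long = reduction              // final result from the buffer
  def bufferEncoder: Encoder[Long] = Encoders.scalaLong
  def outputEncoder: Encoder[Long] = Encoders.scalaLong
}

// Assumption: a local session for demonstration purposes.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("udaf-register-demo") // hypothetical application name
  .getOrCreate()

// udaf wraps the Aggregator so it can be registered by name like any other UDF.
val longSum = udaf(LongSum)
spark.udf.register("long_sum", longSum)

// range(5) yields ids 0 through 4, so the aggregate is their sum.
val total = spark.sql("SELECT long_sum(id) AS total FROM range(5)").head.getLong(0)
```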
Registering User-Defined Python Function¶
registerPython(
name: String,
udf: UserDefinedPythonFunction): Unit
registerPython prints out the following DEBUG message to the logs:
Registering new PythonUDF:
name: [name]
command: [command]
envVars: [envVars]
pythonIncludes: [pythonIncludes]
pythonExec: [pythonExec]
dataType: [dataType]
pythonEvalType: [pythonEvalType]
udfDeterministic: [udfDeterministic]
In the end, registerPython requests the FunctionRegistry to createOrReplaceTempFunction (with the given name, the builder factory, and the python_udf source name).
registerPython is used when:
- UDFRegistration (PySpark) is requested to register
Logging¶
Enable the ALL logging level for the org.apache.spark.sql.UDFRegistration logger to see what happens inside.
Add the following lines to conf/log4j2.properties:
logger.UDFRegistration.name = org.apache.spark.sql.UDFRegistration
logger.UDFRegistration.level = all
Refer to Logging.