UDFRegistration¶
UDFRegistration is a facade to a session-scoped FunctionRegistry to register user-defined functions (UDFs) and user-defined aggregate functions (UDAFs).
Creating Instance¶
UDFRegistration takes the following to be created:
- FunctionRegistry
UDFRegistration is created when:
- BaseSessionStateBuilder is requested for the UDFRegistration
Accessing UDFRegistration¶
UDFRegistration is available as SparkSession.udf.
import org.apache.spark.sql.UDFRegistration
assert(spark.udf.isInstanceOf[UDFRegistration])
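Because the underlying FunctionRegistry is session-scoped, a function registered through one session's UDFRegistration should not be resolvable in a sibling session. The following is a minimal sketch of that behaviour, not a definitive test: the function name plus_one, the object name and the local[1] master are assumptions made for illustration only.

```scala
import scala.util.Try

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object SessionScopedUdfSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell, spark already exists.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("session-scoped-udf-sketch")
      .getOrCreate()

    // Register a UDF under a hypothetical name via this session's UDFRegistration.
    spark.udf.register("plus_one", udf((n: Int) => n + 1))

    // The function resolves in the session it was registered in...
    val result = spark.sql("SELECT plus_one(41)").head.getInt(0)
    assert(result == 42)

    // ...but not in a sibling session, which has its own function registry.
    val other = spark.newSession()
    assert(Try(other.sql("SELECT plus_one(41)")).isFailure)

    spark.stop()
  }
}
```

SparkSession.newSession shares the SparkContext with the original session but isolates registered functions, which is why the second query is expected to fail during analysis.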
SessionState¶
UDFRegistration is used to create a SessionState.
Registering UserDefinedFunction¶
register(
name: String,
udf: UserDefinedFunction): UserDefinedFunction
register associates the given name with the given UserDefinedFunction.
register requests the FunctionRegistry to createOrReplaceTempFunction under the given name, with the scala_udf source name and a function builder based on the type of the UserDefinedFunction:
- For UserDefinedAggregators, the function builder requests the UserDefinedAggregator for a ScalaAggregator
- For all other types, the function builder requests the UserDefinedFunction for a Column and takes the Expression
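The two branches above can be exercised from user code. The following is a hedged sketch, assuming Spark 3.0 or later (where functions.udaf is available): a plain Scala function wrapped with functions.udf should take the second branch, while an Aggregator wrapped with functions.udaf should take the first. The names LongSum, to_upper and long_sum are hypothetical and used only for illustration.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.functions.{udaf, udf}

// Hypothetical typed aggregator that sums Long values.
object LongSum extends Aggregator[Long, Long, Long] {
  def zero: Long = 0L
  def reduce(buffer: Long, value: Long): Long = buffer + value
  def merge(left: Long, right: Long): Long = left + right
  def finish(reduction: Long): Long = reduction
  def bufferEncoder: Encoder[Long] = Encoders.scalaLong
  def outputEncoder: Encoder[Long] = Encoders.scalaLong
}

object RegisterSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell, spark already exists.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("register-sketch")
      .getOrCreate()

    // Scalar UDF: handled by the "all other types" branch.
    spark.udf.register("to_upper", udf((s: String) => s.toUpperCase))

    // Aggregator wrapped by functions.udaf: handled by the aggregator branch.
    spark.udf.register("long_sum", udaf(LongSum))

    // Both functions are now callable by name from SQL.
    val upperResult = spark.sql("SELECT to_upper('spark')").head.getString(0)
    val sumResult = spark.sql("SELECT long_sum(id) FROM range(5)").head.getLong(0)

    assert(upperResult == "SPARK")
    assert(sumResult == 10L) // 0 + 1 + 2 + 3 + 4

    spark.stop()
  }
}
```

register also returns a UserDefinedFunction, so the result can be kept in a value and applied to Columns in the DataFrame API as well as called by name in SQL.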
Registering User-Defined Python Function¶
registerPython(
name: String,
udf: UserDefinedPythonFunction): Unit
registerPython prints out the following DEBUG message to the logs:
Registering new PythonUDF:
name: [name]
command: [command]
envVars: [envVars]
pythonIncludes: [pythonIncludes]
pythonExec: [pythonExec]
dataType: [dataType]
pythonEvalType: [pythonEvalType]
udfDeterministic: [udfDeterministic]
In the end, registerPython requests the FunctionRegistry to createOrReplaceTempFunction (under the given name, with the builder factory and the python_udf source name).
registerPython is used when:
- UDFRegistration (PySpark) is requested to register
Logging¶
Enable ALL logging level for org.apache.spark.sql.UDFRegistration logger to see what happens inside.
Add the following lines to conf/log4j2.properties:
logger.UDFRegistration.name = org.apache.spark.sql.UDFRegistration
logger.UDFRegistration.level = all
Refer to Logging.