ExtractPythonUDFs Physical Query Optimization

[[apply]] ExtractPythonUDFs is a physical query optimization (aka physical query preparation rule or simply preparation rule) that QueryExecution uses to optimize the physical plan of a structured query by <<extract, extracting Python UDFs from the physical query plan>> (excluding FlatMapGroupsInPandasExec operators, which it simply skips over).

Technically, ExtractPythonUDFs is just a link:catalyst/Rule.md[Catalyst rule] for transforming link:SparkPlan.md[physical query plans], i.e. Rule[SparkPlan].
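
To make the Rule[SparkPlan] shape concrete, here is a minimal sketch that runs without Spark on the classpath. All the types in it (Plan, Scan, Project, FlatMapGroupsInPandas, BatchEvalPython, Rule) are simplified stand-ins invented for illustration, not Spark's classes. The "pyudf:" prefix for Python UDF expressions is likewise an assumption of the sketch. It only illustrates a bottom-up rewrite that leaves FlatMapGroupsInPandas nodes alone and moves Python UDF calls into a dedicated evaluation operator. It does not reproduce the actual extract logic.

```scala
// Toy stand-ins for Spark's physical operators (hypothetical, not the real classes).
sealed trait Plan {
  def children: Seq[Plan]
  def withNewChildren(newChildren: Seq[Plan]): Plan

  // Bottom-up rewrite: transform the children first, then try the rule on this node.
  def transformUp(rule: PartialFunction[Plan, Plan]): Plan = {
    val afterChildren = withNewChildren(children.map(_.transformUp(rule)))
    rule.applyOrElse(afterChildren, (p: Plan) => p)
  }
}

case class Scan(table: String) extends Plan {
  def children: Seq[Plan] = Nil
  def withNewChildren(newChildren: Seq[Plan]): Plan = this
}

case class Project(exprs: Seq[String], child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  def withNewChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
}

case class FlatMapGroupsInPandas(child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  def withNewChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
}

case class BatchEvalPython(udfs: Seq[String], child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  def withNewChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
}

// Same shape as a Catalyst rule: one plan in, one plan out.
trait Rule[T] { def apply(plan: T): T }

object ExtractPythonUDFsSketch extends Rule[Plan] {
  // Assumption of this sketch: an expression is a Python UDF call if it starts with "pyudf:".
  private def isPythonUDF(expr: String): Boolean = expr.startsWith("pyudf:")

  def apply(plan: Plan): Plan = plan.transformUp {
    // The node itself is returned unchanged (its children are still visited).
    case pandas: FlatMapGroupsInPandas => pandas
    // Move the UDF calls into an evaluation operator below the projection
    // and refer to their results by position.
    case Project(exprs, child) if exprs.exists(isPythonUDF) =>
      val udfs = exprs.filter(isPythonUDF).distinct
      val rewritten = exprs.map { e =>
        if (isPythonUDF(e)) s"pythonUDF${udfs.indexOf(e)}" else e
      }
      Project(rewritten, BatchEvalPython(udfs, child))
  }
}
```

Applying ExtractPythonUDFsSketch to Project(Seq("id", "pyudf:f(id)"), Scan("t")) yields a Project over a BatchEvalPython node, while a plan rooted at FlatMapGroupsInPandas comes back unchanged.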

ExtractPythonUDFs is part of the preparations batch of physical query plan rules and is executed when QueryExecution is requested for the optimized physical query plan (i.e. in the executedPlan phase of query execution).
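
The following sketch illustrates what running a batch of preparation rules amounts to. It assumes the rules are applied sequentially, each one to the output of the previous one. ToyPlan, tracingRule and SomeOtherPreparationRule are hypothetical names used only for this illustration, and the position of ExtractPythonUDFs in the sequence is not meant to reflect its real position in the batch.

```scala
object PreparationsSketch {
  // Toy plan that only remembers which rules have touched it (hypothetical type).
  final case class ToyPlan(name: String, appliedRules: Vector[String] = Vector.empty)

  // A preparation rule maps a plan to a plan.
  type PreparationRule = ToyPlan => ToyPlan

  // Builds a rule that does nothing except record its own name on the plan.
  def tracingRule(ruleName: String): PreparationRule =
    plan => plan.copy(appliedRules = plan.appliedRules :+ ruleName)

  // The batch of rules; the order here is illustrative only.
  val preparations: Seq[PreparationRule] =
    Seq(tracingRule("ExtractPythonUDFs"), tracingRule("SomeOtherPreparationRule"))

  // Left fold: every rule receives the plan produced by the rule before it.
  def prepareForExecution(sparkPlan: ToyPlan): ToyPlan =
    preparations.foldLeft(sparkPlan) { (plan, rule) => rule(plan) }
}
```

In a real Spark session the plans before and after this phase can be compared through the sparkPlan and executedPlan properties of a query's QueryExecution.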

=== [[extract]] Extracting Python UDFs from Physical Query Plan -- extract Internal Method

[source, scala]
----
extract(plan: SparkPlan): SparkPlan
----

extract...FIXME

NOTE: extract is used exclusively when ExtractPythonUDFs is requested to <<apply, optimize a physical query plan>>.

=== [[trySplitFilter]] trySplitFilter Internal Method

[source, scala]
----
trySplitFilter(plan: SparkPlan): SparkPlan
----

trySplitFilter...FIXME

NOTE: trySplitFilter is used exclusively when ExtractPythonUDFs is requested to <<extract, extract Python UDFs from a physical query plan>>.