UnresolvedFlow¶
UnresolvedFlow is a Flow that represents a flow in the Python and SQL transformations in Spark Declarative Pipelines:
- register_flow in PySpark's decorators
- CREATE FLOW ... AS INSERT INTO ... BY NAME
- CREATE MATERIALIZED VIEW
- CREATE STREAMING TABLE ... AS
- CREATE VIEW and the other variants of CREATE VIEW
UnresolvedFlow is registered with a GraphRegistrationContext when it is requested to register a flow.
UnresolvedFlow is analyzed and resolved to a ResolvedFlow (by FlowResolver when DataflowGraph is requested to resolve).
Every UnresolvedFlow must have a unique identifier (or an AnalysisException is reported).
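The uniqueness rule can be illustrated with a small toy model. This is only a sketch of the idea, not Spark's code: `FlowRegistry` and `DuplicateFlowError` are hypothetical names (the latter standing in for AnalysisException), and identifiers are reduced to plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict


class DuplicateFlowError(Exception):
    """Hypothetical stand-in for Spark's AnalysisException."""


@dataclass
class FlowRegistry:
    """Toy model of a registration context; not Spark's actual class."""

    # Maps a flow identifier to the identifier of its destination.
    flows: Dict[str, str] = field(default_factory=dict)

    def register_flow(self, identifier: str, destination: str) -> None:
        # Reject a second flow that reuses an already-registered identifier.
        if identifier in self.flows:
            raise DuplicateFlowError(
                f"Flow {identifier!r} is defined more than once"
            )
        self.flows[identifier] = destination


registry = FlowRegistry()
registry.register_flow("sales.orders_flow", "sales.orders")

try:
    # Same flow identifier again, even with a different destination.
    registry.register_flow("sales.orders_flow", "sales.orders_copy")
    duplicate_rejected = False
except DuplicateFlowError:
    duplicate_rejected = True
```

In this model the first registration wins and the duplicate is rejected without modifying the registry. Where exactly Spark performs the check is not covered by this sketch.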
Creating Instance¶
UnresolvedFlow takes the following to be created:
- TableIdentifier
- Flow destination (TableIdentifier)
- FlowFunction
- QueryContext
- SQL Config
- once flag
- QueryOrigin
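As a reading aid, the constructor arguments above can be mirrored in a plain Python data class. This is a simplified model under stated assumptions, not the Spark class itself: the field names (`identifier`, `destination_identifier`, `func`, `query_context`, `sql_conf`, `once`, `origin`) and the stand-in types are illustrative choices, and `UnresolvedFlowModel` is a hypothetical name.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class TableIdentifier:
    """Simplified stand-in for Spark's TableIdentifier."""

    table: str
    database: Optional[str] = None
    catalog: Optional[str] = None


@dataclass(frozen=True)
class UnresolvedFlowModel:
    """Hypothetical model of the values an UnresolvedFlow is created with."""

    identifier: TableIdentifier              # the flow's own identifier
    destination_identifier: TableIdentifier  # the flow destination
    func: Callable[[], str]                  # stand-in for FlowFunction
    query_context: Dict[str, Optional[str]]  # stand-in for QueryContext
    sql_conf: Dict[str, str] = field(default_factory=dict)  # SQL Config
    once: bool = False                       # the once flag
    origin: Optional[str] = None             # stand-in for QueryOrigin


flow = UnresolvedFlowModel(
    identifier=TableIdentifier("orders_flow", "sales"),
    destination_identifier=TableIdentifier("orders", "sales"),
    func=lambda: "SELECT * FROM raw_orders",
    query_context={"current_catalog": None, "current_database": "sales"},
    sql_conf={"spark.sql.shuffle.partitions": "8"},
)
```

The flow identifier and the destination are kept as separate fields because a flow and the dataset it writes to are identified independently.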
UnresolvedFlow is created when:
- PipelinesHandler is requested to define a flow
- SqlGraphRegistrationContext is requested to handle the following logical commands:
once Flag¶
UnresolvedFlow is given the once flag when created.
The once flag is disabled (false) explicitly for the following:
- CreateFlowHandler
- CreateMaterializedViewAsSelectHandler
- CreatePersistedViewCommandHandler
- CreateStreamingTableAsSelectHandler
- CreateTemporaryViewHandler
- PipelinesHandler (when requested to define a flow)
No ONCE UnresolvedFlows
It turns out that none of the UnresolvedFlows created are ONCE flows.
As the commit message says:
However, the server does not currently implement this behavior yet. To avoid accidentally releasing APIs that don't actually work, we should take these arguments out for now. And add them back in when we actually support this functionality.