GraphRegistrationContext¶
GraphRegistrationContext is a registry of tables, views, and flows in a dataflow graph (described with Python decorators and SQL statements).
GraphRegistrationContext is required to create a new SqlGraphRegistrationContext.
Creating Instance¶
GraphRegistrationContext takes the following to be created:
- Default Catalog
- Default Database
- Default SQL Configuration Properties
GraphRegistrationContext is created when:
- DataflowGraphRegistry is requested to createDataflowGraph
Create DataflowGraph¶
toDataflowGraph: DataflowGraph
toDataflowGraph creates a new DataflowGraph with the tables, views, and flows fully-qualified, resolved, and de-duplicated.
AnalysisException
toDataflowGraph reports an AnalysisException for a GraphRegistrationContext with no tables and no PersistedViews (in the views registry).
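The emptiness check described above can be sketched as follows. This is a minimal illustration, not the Spark implementation: `Table`, `View`, `PersistedView` and `TemporaryView` are simplified stand-ins, `EmptyGraphCheck.validate` is a hypothetical helper, and `IllegalStateException` stands in for Spark's `AnalysisException`.

```scala
// Simplified stand-ins for the real Spark classes (hypothetical shapes).
final case class Table(name: String)
sealed trait View { def name: String }
final case class PersistedView(name: String) extends View
final case class TemporaryView(name: String) extends View

object EmptyGraphCheck {
  // Fails only when there are no tables AND no PersistedViews.
  // Other kinds of views do not count towards a non-empty graph.
  // IllegalStateException is a stand-in for Spark's AnalysisException.
  def validate(tables: Seq[Table], views: Seq[View]): Unit = {
    val persistedViews = views.collect { case v: PersistedView => v }
    if (tables.isEmpty && persistedViews.isEmpty) {
      throw new IllegalStateException("No tables or persisted views registered")
    }
  }
}
```

Under these assumptions, a context holding only temporary views is rejected, while a single table or a single persisted view is enough to pass the check.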
toDataflowGraph is used when:
- PipelinesHandler is requested to start a pipeline run
Tables¶
GraphRegistrationContext creates an empty registry of Tables when created.
A new Table is added by registerTable.
Views¶
GraphRegistrationContext creates an empty registry of Views when created.
Flows¶
GraphRegistrationContext creates an empty registry of UnresolvedFlows when created.
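The creation arguments and the three initially-empty registries can be pictured with the sketch below. It is a simplified model under stated assumptions: `RegistrationContextSketch` is a hypothetical class name, the parameter names and types are guesses based on the descriptions above, and `Table`, `View` and `UnresolvedFlow` are reduced to placeholder case classes.

```scala
import scala.collection.mutable

// Hypothetical placeholder element types; the real ones carry far more detail.
final case class Table(name: String)
final case class View(name: String)
final case class UnresolvedFlow(name: String)

// Sketch of the three creation arguments and the three registries,
// each of which starts out empty.
class RegistrationContextSketch(
    val defaultCatalog: String,
    val defaultDatabase: String,
    val defaultSqlConf: Map[String, String]) {
  val tables: mutable.ListBuffer[Table] = mutable.ListBuffer.empty[Table]
  val views: mutable.ListBuffer[View] = mutable.ListBuffer.empty[View]
  val flows: mutable.ListBuffer[UnresolvedFlow] = mutable.ListBuffer.empty[UnresolvedFlow]
}
```

A freshly created instance, for example `new RegistrationContextSketch("spark_catalog", "default", Map.empty)`, has nothing in any of its registries until definitions are registered.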
Register Table¶
registerTable(
tableDef: Table): Unit
registerTable adds the given Table to the tables registry.
registerTable is used when:
- PipelinesHandler (Spark Connect) is requested to handle DEFINE_DATASET command
Register Flow¶
registerFlow(
flowDef: UnresolvedFlow): Unit
registerFlow adds the given UnresolvedFlow to the flows registry.
registerFlow is used when:
- PipelinesHandler (Spark Connect) is requested to handle DEFINE_FLOW command
- SqlGraphRegistrationContext is requested to process the following SQL commands:
  - CreateFlowCommand
  - CreateMaterializedViewAsSelect
  - CreateView
  - CreateStreamingTableAsSelect
  - CreateViewCommand
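Both registerTable and registerFlow are described as simple additions to a registry, which the following sketch illustrates. It assumes append-only in-memory buffers; `RegistrySketch`, its accessor methods, and the fields of the placeholder `Table` and `UnresolvedFlow` case classes are hypothetical and not taken from Spark.

```scala
import scala.collection.mutable

// Hypothetical placeholder types; the fields are for illustration only.
final case class Table(name: String)
final case class UnresolvedFlow(name: String, destination: String)

class RegistrySketch {
  private val tables = mutable.ListBuffer.empty[Table]
  private val flows = mutable.ListBuffer.empty[UnresolvedFlow]

  // Appends the given definition to the matching registry.
  def registerTable(tableDef: Table): Unit = { tables += tableDef }
  def registerFlow(flowDef: UnresolvedFlow): Unit = { flows += flowDef }

  // Read-only snapshots in registration order.
  def registeredTables: List[Table] = tables.toList
  def registeredFlows: List[UnresolvedFlow] = flows.toList
}
```

In this model, registration does no validation or name resolution; that work is deferred until the graph is built from the registries.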