GraphRegistrationContext¶
GraphRegistrationContext is a registry of tables, views, and flows in a pipeline (dataflow graph).
GraphRegistrationContext is required to create a new SqlGraphRegistrationContext.
Eventually, GraphRegistrationContext becomes a DataflowGraph (to create a PipelineUpdateContextImpl to run a pipeline).
Creating Instance¶
GraphRegistrationContext takes the following to be created:
- Default Catalog
- Default Database
- Default SQL Configuration Properties
GraphRegistrationContext is created when:
- DataflowGraphRegistry is requested to createDataflowGraph
Create DataflowGraph¶
toDataflowGraph: DataflowGraph
toDataflowGraph creates a new DataflowGraph with the tables, views, sinks and flows fully-qualified, resolved, and de-duplicated.
AnalysisException
toDataflowGraph reports an AnalysisException when this GraphRegistrationContext is empty.
toDataflowGraph is used when:
- PipelinesHandler is requested to start a pipeline run
isPipelineEmpty¶
isPipelineEmpty: Boolean
isPipelineEmpty is true when this pipeline (this GraphRegistrationContext) is empty, i.e., when all of the following hold:
- No tables registered
- No PersistedViews registered (among the views)
- No sinks registered
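The emptiness condition above can be sketched as follows. This is a minimal model and not Spark's code: View, PersistedView, TemporaryView, and MiniRegistry are simplified stand-ins introduced here for illustration, and plain strings take the place of real tables and sinks.

```scala
import scala.collection.mutable.ListBuffer

object EmptinessSketch {
  // Simplified stand-ins (assumptions), not the Spark classes of the same name.
  sealed trait View { def name: String }
  final case class PersistedView(name: String) extends View
  final case class TemporaryView(name: String) extends View

  // Hypothetical mini-registry holding only what the emptiness check needs.
  final class MiniRegistry {
    val tables = ListBuffer.empty[String]
    val views = ListBuffer.empty[View]
    val sinks = ListBuffer.empty[String]

    // Empty only when all three conditions hold.
    // Views that are not PersistedViews do not count.
    def isPipelineEmpty: Boolean =
      tables.isEmpty &&
        !views.exists(_.isInstanceOf[PersistedView]) &&
        sinks.isEmpty
  }
}
```

In this sketch, registering only a TemporaryView leaves the registry empty, while a single table, PersistedView, or sink makes it non-empty.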
assertNoDuplicates¶
assertNoDuplicates(
qualifiedTables: Seq[Table],
validatedViews: Seq[View],
qualifiedFlows: Seq[UnresolvedFlow]): Unit
assertNoDuplicates...FIXME
assertFlowIdentifierIsUnique¶
assertFlowIdentifierIsUnique(
flow: UnresolvedFlow,
datasetType: DatasetType,
flows: Seq[UnresolvedFlow]): Unit
assertFlowIdentifierIsUnique throws an AnalysisException if the given UnresolvedFlow's identifier is used by multiple flows (among the given flows):
Flow [flow_name] was found in multiple datasets: [dataset_names]
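A rough sketch of such a uniqueness check is shown below. It is an illustration under stated assumptions, not the actual implementation: Flow is a simplified stand-in for UnresolvedFlow, its destination field is assumed, the datasetType parameter is left out, and IllegalStateException stands in for AnalysisException.

```scala
object FlowUniquenessSketch {
  // Simplified stand-in for UnresolvedFlow; `destination` is an assumed field.
  final case class Flow(identifier: String, destination: String)

  // Throws when the identifier of `flow` is already used by one of `flows`.
  // IllegalStateException stands in for Spark's AnalysisException.
  def assertFlowIdentifierIsUnique(flow: Flow, flows: Seq[Flow]): Unit =
    flows.find(_.identifier == flow.identifier).foreach { duplicate =>
      // Report the destinations of both flows that share the identifier.
      val datasets =
        Seq(flow.destination, duplicate.destination).distinct.mkString(",")
      throw new IllegalStateException(
        s"Flow ${flow.identifier} was found in multiple datasets: $datasets")
    }
}
```

The check is a no-op when no flow among the given flows shares the identifier.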
Tables¶
GraphRegistrationContext creates an empty registry of Tables when created.
A new Table is added when GraphRegistrationContext is requested to register a table.
Views¶
GraphRegistrationContext creates an empty registry of Views when created.
Sinks¶
GraphRegistrationContext creates an empty registry of Sinks when created.
A new sink is registered using registerSink (when PipelinesHandler is requested to define a sink).
All the sinks registered are available via getSinks.
A pipeline is considered empty only when no sinks are registered (and, likewise, no tables and no persisted views).
Eventually, GraphRegistrationContext uses the sinks to create a DataflowGraph.
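The register-then-read lifecycle of the sinks registry could look roughly like the sketch below. Sink and MiniContext are hypothetical, simplified stand-ins, and the use of a ListBuffer is an assumption made for illustration only.

```scala
import scala.collection.mutable.ListBuffer

object SinkRegistrySketch {
  // Simplified stand-in for a sink definition (assumption).
  final case class Sink(name: String)

  // Hypothetical context that starts with an empty sinks registry.
  final class MiniContext {
    private val sinks = ListBuffer.empty[Sink]

    // Appends the given sink to the registry.
    def registerSink(sinkDef: Sink): Unit = {
      sinks += sinkDef
    }

    // Returns an immutable snapshot of all the sinks registered so far.
    def getSinks: Seq[Sink] = sinks.toSeq
  }
}
```

Returning a snapshot from getSinks keeps callers from mutating the registry directly. That is a design choice of this sketch, not a documented property of GraphRegistrationContext.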
Flows¶
GraphRegistrationContext creates an empty registry of UnresolvedFlows when created.
Register Flow¶
registerFlow(
flowDef: UnresolvedFlow): Unit
registerFlow adds the given UnresolvedFlow to the flows registry.
registerFlow is used when:
- PipelinesHandler is requested to define a flow
- SqlGraphRegistrationContext is requested to process the following SQL queries:
Register Sink¶
registerSink(
sinkDef: Sink): Unit
registerSink adds the given Sink to the sinks registry.
registerSink is used when:
- PipelinesHandler is requested to define an output
Register Table¶
registerTable(
tableDef: Table): Unit
registerTable adds the given Table to the tables registry.
registerTable is used when:
- PipelinesHandler is requested to define an output
- SqlGraphRegistrationContext is requested to process the following SQL queries: