GraphRegistrationContext

GraphRegistrationContext is a registry of tables, views, and flows in a pipeline (dataflow graph).

GraphRegistrationContext is required to create a new SqlGraphRegistrationContext.

Eventually, GraphRegistrationContext is converted to a DataflowGraph (that is then used to create a PipelineUpdateContextImpl to run a pipeline).

Creating Instance

GraphRegistrationContext takes the following to be created:

  • Default Catalog
  • Default Database
  • Default SQL Configuration Properties

GraphRegistrationContext is created when:

Create DataflowGraph

toDataflowGraph: DataflowGraph

toDataflowGraph creates a new DataflowGraph with the tables, views, sinks and flows fully-qualified, resolved, and de-duplicated.

AnalysisException

toDataflowGraph reports an AnalysisException when this GraphRegistrationContext is empty.


toDataflowGraph is used when:

isPipelineEmpty

isPipelineEmpty: Boolean

isPipelineEmpty is true when this pipeline (this GraphRegistrationContext) is empty, i.e., when all of the following hold:

  1. No tables registered
  2. No PersistedViews registered (among the views)
  3. No sinks registered
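
The three conditions above can be sketched as follows. This is a minimal illustration only, not the actual Spark code: RegistrySketch, ViewSketch, TemporaryViewSketch and PersistedViewSketch are hypothetical stand-ins, and tables and sinks are reduced to plain names.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-ins for the pipeline types (assumptions for illustration only)
sealed trait ViewSketch
case class TemporaryViewSketch(name: String) extends ViewSketch
case class PersistedViewSketch(name: String) extends ViewSketch

class RegistrySketch {
  val tables = ListBuffer.empty[String]
  val views = ListBuffer.empty[ViewSketch]
  val sinks = ListBuffer.empty[String]

  // Empty only when all three conditions hold at once.
  // Only persisted views count; temporary views are ignored.
  def isPipelineEmpty: Boolean =
    tables.isEmpty &&
      views.collect { case v: PersistedViewSketch => v }.isEmpty &&
      sinks.isEmpty
}
```

In this sketch, registering only a temporary view leaves the pipeline empty, while a single table, persisted view, or sink makes it non-empty.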

assertNoDuplicates

assertNoDuplicates(
  qualifiedTables: Seq[Table],
  validatedViews: Seq[View],
  qualifiedFlows: Seq[UnresolvedFlow]): Unit

assertNoDuplicates...FIXME

assertFlowIdentifierIsUnique

assertFlowIdentifierIsUnique(
  flow: UnresolvedFlow,
  datasetType: DatasetType,
  flows: Seq[UnresolvedFlow]): Unit

assertFlowIdentifierIsUnique throws an AnalysisException if the given UnresolvedFlow's identifier is used by multiple flows (among the given flows):

Flow [flow_name] was found in multiple datasets: [dataset_names]
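
A rough sketch of such a uniqueness check is shown below. It rests on several assumptions: FlowSketch and its fields are hypothetical, IllegalStateException stands in for AnalysisException, and the datasetType parameter is left out because its role is not described here.

```scala
// Hypothetical stand-in for UnresolvedFlow (assumption for illustration only)
case class FlowSketch(identifier: String, destination: String)

def assertFlowIdentifierIsUniqueSketch(
    flow: FlowSketch,
    flows: Seq[FlowSketch]): Unit = {
  // All the flows that share the identifier of the given flow
  val sameIdentifier = flows.filter(_.identifier == flow.identifier)
  if (sameIdentifier.size > 1) {
    val datasetNames = sameIdentifier.map(_.destination).distinct.mkString(", ")
    // IllegalStateException is used in place of AnalysisException in this sketch
    throw new IllegalStateException(
      s"Flow ${flow.identifier} was found in multiple datasets: $datasetNames")
  }
}
```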

Tables

GraphRegistrationContext creates an empty registry of Tables when created.

A new Table is added when GraphRegistrationContext is requested to register a table.

Views

GraphRegistrationContext creates an empty registry of Views when created.

Sinks

GraphRegistrationContext creates an empty registry of Sinks when created.

A new sink is registered using registerSink (when PipelinesHandler is requested to define a sink).

All the sinks registered are available via getSinks.
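
The register-then-read pattern of the sinks registry can be sketched as below. SinkSketch and SinkRegistrySketch are hypothetical names, and the mutable ListBuffer is an assumption about how such a registry could be kept, not a statement about the actual internals.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for Sink (assumption for illustration only)
case class SinkSketch(name: String)

class SinkRegistrySketch {
  // Starts empty, as the sinks registry does when the context is created
  private val sinks = ListBuffer.empty[SinkSketch]

  def registerSink(sinkDef: SinkSketch): Unit = {
    sinks += sinkDef
  }

  // Immutable snapshot of everything registered so far
  def getSinks: Seq[SinkSketch] = sinks.toSeq
}
```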

A pipeline is considered empty only if there are no sinks registered, in addition to no tables and no PersistedViews (see isPipelineEmpty).

Eventually, GraphRegistrationContext uses the sinks to create a DataflowGraph.

Flows

GraphRegistrationContext creates an empty registry of UnresolvedFlows when created.

Register Flow

registerFlow(
  flowDef: UnresolvedFlow): Unit

registerFlow adds the given UnresolvedFlow to the flows registry.


registerFlow is used when:

Register Sink

registerSink(
  sinkDef: Sink): Unit

registerSink adds the given Sink to the sinks registry.


registerSink is used when:

Register Table

registerTable(
  tableDef: Table): Unit

registerTable adds the given Table to the tables registry.


registerTable is used when: