Skip to content

DataflowGraph

DataflowGraph is a GraphRegistrationContext with tables, sinks, views and flows fully-qualified, resolved and de-duplicated.

Creating Instance

DataflowGraph takes the following to be created:

DataflowGraph is created when:

Outputs

output: Map[TableIdentifier, Output]
Lazy Value

output is a Scala lazy value to guarantee that the code to initialize it is executed once only (when accessed for the first time) and the computed value never changes afterwards.

Learn more in the Scala Language Specification.

output is a collection of unique Outputs (tables and sinks) by their TableIdentifier.


output is used when:

reanalyzeFlow

reanalyzeFlow(
  srcFlow: Flow): ResolvedFlow

reanalyzeFlow finds the upstream datasets.

reanalyzeFlow finds the upstream flows (for the upstream datasets that could be found in the resolvedFlows registry).

reanalyzeFlow finds the upstream views (for the upstream datasets that could be found in the view registry).

reanalyzeFlow creates a new (sub)DataflowGraph for the upstream flows, views and a single table (the destination of the given Flow).

reanalyzeFlow requests the subgraph to resolve and returns the ResolvedFlow for the given Flow.


reanalyzeFlow is used when:

Resolve

resolve(): DataflowGraph

resolve...FIXME


resolve is used when:

Validate

validate(): DataflowGraph

validate...FIXME


validate is used when: