DataflowGraphRegistry¶
DataflowGraphRegistry is a registry of Dataflow Graphs.
Scala object
DataflowGraphRegistry is an object in Scala, which means it is a class that has exactly one instance (itself). A Scala object is created lazily when it is referenced for the first time.
Learn more in Tour of Scala.
Demo¶
import org.apache.spark.sql.connect.pipelines.DataflowGraphRegistry
val graphId = DataflowGraphRegistry.createDataflowGraph(
  defaultCatalog = spark.catalog.currentCatalog(),
  defaultDatabase = spark.catalog.currentDatabase,
  defaultSqlConf = Map.empty)
Dataflow Graphs¶
dataflowGraphs: ConcurrentHashMap[String, GraphRegistrationContext]
DataflowGraphRegistry creates an empty collection of GraphRegistrationContexts keyed by their UUIDs.
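The idea of a UUID-keyed registry can be illustrated with a minimal sketch. It is not Spark's implementation: GraphContext and GraphRegistrySketch are hypothetical stand-ins, and generating the ID with a random UUID is an assumption made for illustration only.

```scala
import java.util.UUID
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for GraphRegistrationContext (assumption: only the defaults are kept).
final case class GraphContext(
    defaultCatalog: String,
    defaultDatabase: String,
    defaultSqlConf: Map[String, String])

// Hypothetical registry for illustration, not Spark's DataflowGraphRegistry.
object GraphRegistrySketch {
  // Thread-safe map from a graph ID (a UUID rendered as a String) to its context.
  private val graphs = new ConcurrentHashMap[String, GraphContext]()

  // Stores a new context under a fresh random UUID (an assumption) and returns that ID.
  def create(
      defaultCatalog: String,
      defaultDatabase: String,
      defaultSqlConf: Map[String, String]): String = {
    val graphId = UUID.randomUUID().toString
    graphs.put(graphId, GraphContext(defaultCatalog, defaultDatabase, defaultSqlConf))
    graphId
  }

  // Reports whether a context is registered under the given ID.
  def contains(graphId: String): Boolean = graphs.containsKey(graphId)
}
```

A ConcurrentHashMap lets several sessions register and look up entries concurrently without external locking, which is a plausible reason to prefer it over a plain mutable map in a process-wide singleton.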
createDataflowGraph¶
createDataflowGraph(
  defaultCatalog: String,
  defaultDatabase: String,
  defaultSqlConf: Map[String, String]): String
createDataflowGraph
...FIXME
createDataflowGraph is used when:
PipelinesHandler (Spark Connect) is requested to createDataflowGraph
Find Dataflow Graph (or Throw SparkException)¶
getDataflowGraphOrThrow(
  dataflowGraphId: String): GraphRegistrationContext
getDataflowGraphOrThrow looks up the GraphRegistrationContext for the given dataflowGraphId or throws a SparkException if it does not exist.
Dataflow graph with id [graphId] could not be found
getDataflowGraphOrThrow is used when:
PipelinesHandler (Spark Connect) is requested to defineDataset, defineFlow, defineSqlGraphElements, startRun
Find Dataflow Graph¶
getDataflowGraph(
  graphId: String): Option[GraphRegistrationContext]
getDataflowGraph finds the GraphRegistrationContext for the given graphId (in this dataflowGraphs registry).
getDataflowGraph is used when:
DataflowGraphRegistry is requested to getDataflowGraphOrThrow
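The relationship between the two lookups (an Option-returning finder and a throwing variant built on top of it) can be sketched as follows. This is an illustration under stated assumptions, not Spark's code: LookupSketch, register, find and findOrThrow are hypothetical names, contexts are plain Strings, and NoSuchElementException replaces SparkException so that the example runs without Spark on the classpath.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical registry for illustration, not Spark's DataflowGraphRegistry.
object LookupSketch {
  // Contexts are plain Strings here (assumption) to keep the sketch self-contained.
  private val graphs = new ConcurrentHashMap[String, String]()

  // Helper for the sketch only: stores a context under the given ID.
  def register(graphId: String, context: String): Unit = {
    graphs.put(graphId, context)
    ()
  }

  // ConcurrentHashMap.get returns null for a missing key; Option(...) maps null to None.
  def find(graphId: String): Option[String] = Option(graphs.get(graphId))

  // Delegates to find and fails when no context is registered under the ID.
  // NoSuchElementException is a swapped-in exception type (the source names SparkException).
  def findOrThrow(graphId: String): String =
    find(graphId).getOrElse(
      throw new NoSuchElementException(
        s"Dataflow graph with id $graphId could not be found"))
}
```

Keeping the Option-returning lookup separate lets callers that can tolerate a missing graph avoid exception handling, while request handlers that require the graph use the throwing variant.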