PipelineExecution¶
PipelineExecution
manages the lifecycle of a GraphExecution (in the given PipelineUpdateContext).
PipelineExecution
is part of PipelineUpdateContext.
Creating Instance¶
PipelineExecution
takes the following to be created:
PipelineExecution
is created alongside PipelineUpdateContext.
Run Pipeline (and Wait for Completion)¶
runPipeline(): Unit
runPipeline
starts the pipeline and requests the PipelineExecution (of this PipelineUpdateContext) to wait for the execution to complete.
runPipeline
is used when:
PipelinesHandler
is requested to startRun (for Spark Connect)
Start Pipeline¶
startPipeline(): Unit
startPipeline
resolves and validates the pipeline graph.
startPipeline
creates a new TriggeredGraphExecution.
In the end, startPipeline
requests the GraphExecution to start.
startPipeline
is used when:
PipelineExecution
is requested to runPipeline
Await Completion¶
awaitCompletion(): Unit
awaitCompletion
requests this GraphExecution to awaitCompletion.
awaitCompletion
is used when:
PipelineExecution
is requested to runPipeline
Initialize Dataflow Graph¶
initializeGraph(): DataflowGraph
initializeGraph
requests this PipelineUpdateContext for the unresolved DataflowGraph to be resolved and validated.
In the end, initializeGraph
materializes the tables (datasets).
initializeGraph
is used when:
PipelineExecution
is requested to start the pipeline
GraphExecution¶
graphExecution: Option[GraphExecution]
graphExecution
is the GraphExecution of the pipeline.
PipelineExecution
creates a TriggeredGraphExecution when startPipeline.
Used in:
Is Execution Started¶
executionStarted: Boolean
executionStarted
is a flag that indicates whether this GraphExecution has been created or not.
executionStarted
is used when:
SessionHolder
(Spark Connect) is requested toremoveCachedPipelineExecution
stopPipeline¶
stopPipeline(): Unit
stopPipeline
requests this GraphExecution to stop.
In case this GraphExecution
has not been created (started) yet, stopPipeline
reports a IllegalStateException
:
Pipeline execution has not started yet.
stopPipeline
is used when:
SessionHolder
(Spark Connect) is requested toremoveCachedPipelineExecution