PipelineRunner

PipelineRunner<ResultT extends PipelineResult> is an abstraction to run a Pipeline (and produce a PipelineResult).

PipelineRunner is defined using PipelineOptions.

val result = pipeline.run

assert(result.isInstanceOf[org.apache.beam.sdk.PipelineResult])

result.waitUntilFinish
import org.apache.beam.sdk.options.PipelineOptionsFactory
val options = PipelineOptionsFactory.create()

import org.apache.beam.sdk.PipelineRunner
val runner = PipelineRunner.fromOptions(options)

scala> :type runner
org.apache.beam.sdk.PipelineRunner[_ <: org.apache.beam.sdk.PipelineResult]

Available PipelineRunners

PipelineRunner Description

DataflowRunner

DirectRunner

FlinkRunner

PortableRunner

SparkRunnerDebugger

SparkStructuredStreamingRunner

TestSparkRunner

You can find more runners in the official documentation.

Contract

run Method

// ResultT extends PipelineResult
ResultT run(
  Pipeline pipeline)

run runs the given Pipeline and produces a PipelineResult (as a ResultT)

Used when Pipeline is requested to run.