ParDo
ParDo
is a utility to create ParDo.SingleOutput transformations (to execute DoFn element-wise functions).
ParDo.of Utility
SingleOutput<InputT, OutputT> of(
DoFn<InputT, OutputT> fn)
ParDo.of
creates a ParDo.SingleOutput transformation.
ParDo.SingleOutput PTransform
SingleOutput<InputT, OutputT>
is a PTransform that is created using ParDo.of utility (for the user-defined DoFn<InputT, OutputT> to be executed on all of the input elements of type InputT
to produce values of type OutputT
).
PTransform<PCollection<? extends InputT>, PCollection<OutputT>>
Demo
import org.apache.beam.sdk.Pipeline
val p = Pipeline.create()
import org.apache.beam.sdk.transforms.Create
val nums = p.apply("Input Numbers", Create.of(0, 1, 2, 3, 4, 5))
import org.apache.beam.sdk.transforms.DoFn
import org.apache.beam.sdk.transforms.DoFn._
val multiplyBy2 = new DoFn[Int, Int]() {
// Must return void or ProcessContinuation
@ProcessElement
def processElement(@Element element: Int): Unit = {
element * 2
}
}
import org.apache.beam.sdk.transforms.ParDo
val singleOutput = ParDo.of(multiplyBy2)
scala> :type singleOutput
org.apache.beam.sdk.transforms.ParDo.SingleOutput[Int,Int]
val multipliedByTwo = nums.apply("Multiply By Two", singleOutput)
scala> :type multipliedByTwo
org.apache.beam.sdk.values.PCollection[Int]