PythonArrowOutput

PythonArrowOutput is an extension of the BasePythonRunner abstraction for vectorized (ColumnarBatch) runners.

Scala Definition
trait PythonArrowOutput[OUT <: AnyRef] {
    self: BasePythonRunner[_, OUT] =>
    // ...
}
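
The self-type in the definition above means the trait can only be mixed into a class that is itself a BasePythonRunner with a matching OUT type. A minimal sketch of that constraint follows, using hypothetical stand-in types (DemoRunner, DemoArrowOutput, DemoArrowRunner) rather than Spark's own classes:

```scala
// Hypothetical stand-in for BasePythonRunner; not Spark's class.
abstract class DemoRunner[IN, OUT](val name: String)

// Same shape as the definition above: the self-type restricts
// where the trait can be mixed in.
trait DemoArrowOutput[OUT <: AnyRef] { self: DemoRunner[_, OUT] =>
  // Members of DemoRunner (such as name) are visible here
  // because of the self-type.
  def describe: String = s"$name produces Arrow output"
}

// Compiles: the class is a DemoRunner whose OUT matches the trait's OUT.
class DemoArrowRunner
  extends DemoRunner[Int, String]("demo")
  with DemoArrowOutput[String]

// Would not compile, since the self-type requirement is not met:
// class NotARunner extends DemoArrowOutput[String]
```

The design choice is that the trait can rely on the runner's members without extending the runner itself, so several such traits can be combined on one concrete runner.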

Contract

Deserializing ColumnarBatch

deserializeColumnarBatch(
  batch: ColumnarBatch,
  schema: StructType): OUT
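
As an illustration of the contract, the sketch below shows two ways an implementation might produce OUT from a batch. DemoBatch and DemoSchema are hypothetical stand-ins for ColumnarBatch and StructType, and BatchOutput, PassThroughOutput and RowMapsOutput are made-up names, so this is a simplified model and not Spark's code:

```scala
// Hypothetical stand-ins for Spark's ColumnarBatch and StructType.
final case class DemoBatch(columns: Vector[Vector[Any]]) {
  // Number of rows is the length of the first column (0 if there are none).
  def numRows: Int = columns.headOption.map(_.size).getOrElse(0)
}
final case class DemoSchema(fieldNames: Vector[String])

// Simplified contract: one method turning a batch plus its schema into OUT.
trait BatchOutput[OUT <: AnyRef] {
  def deserializeColumnarBatch(batch: DemoBatch, schema: DemoSchema): OUT
}

// Variant 1: pass the batch through unchanged (OUT is the batch type itself).
trait PassThroughOutput extends BatchOutput[DemoBatch] {
  def deserializeColumnarBatch(batch: DemoBatch, schema: DemoSchema): DemoBatch =
    batch
}

// Variant 2: pivot the columns into rows keyed by field name.
trait RowMapsOutput extends BatchOutput[Vector[Map[String, Any]]] {
  def deserializeColumnarBatch(
      batch: DemoBatch,
      schema: DemoSchema): Vector[Map[String, Any]] =
    Vector.tabulate(batch.numRows) { i =>
      schema.fieldNames.zip(batch.columns.map(_(i))).toMap
    }
}
```

Keeping OUT generic lets one reading loop serve both kinds of runner: those that hand batches on as they are and those that convert each batch into another representation.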

See:

Used when:

Performance Metrics

pythonMetrics: Map[String, SQLMetric]

SQLMetrics (Spark SQL):

  • pythonNumRowsReceived
  • pythonDataReceived
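
A rough sketch of how the two metrics could be updated per received batch is shown below. DemoMetric is a hypothetical stand-in for SQLMetric, and MetricsReporting, DemoReader and recordBatch are illustrative names. Updating once per batch is an assumption here, not something this page states:

```scala
// Hypothetical stand-in for SQLMetric: an additive counter.
final class DemoMetric {
  private var total = 0L
  def +=(n: Long): Unit = { total = total + n }
  def value: Long = total
}

trait MetricsReporting {
  // Same shape as the contract: metrics looked up by name.
  def pythonMetrics: Map[String, DemoMetric]

  // Assumed usage: called once for every batch read back from the Python worker.
  def recordBatch(numRows: Long, numBytes: Long): Unit = {
    pythonMetrics("pythonNumRowsReceived") += numRows
    pythonMetrics("pythonDataReceived") += numBytes
  }
}

// A concrete holder that registers the two metrics listed above.
final class DemoReader extends MetricsReporting {
  val pythonMetrics: Map[String, DemoMetric] = Map(
    "pythonNumRowsReceived" -> new DemoMetric,
    "pythonDataReceived" -> new DemoMetric)
}
```

Calling recordBatch(3L, 128L) and then recordBatch(2L, 64L) on one DemoReader leaves the row counter at 5 and the byte counter at 192.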

Used when:

Implementations