SparkEnv

Learn More

This page is a stub that covers pythonWorkers and related members. Learn more in The Internals of Apache Spark.

pythonWorkers Registry

pythonWorkers: Map[(String, Map[String, String]), PythonWorkerFactory]

When created, SparkEnv creates an empty registry of PythonWorkerFactorys, keyed by the pair of their pythonExec and envVars.

A new PythonWorkerFactory is created in createPythonWorker when no PythonWorkerFactory is registered for the given pythonExec and envVars pair.

All PythonWorkerFactorys are requested to stop when SparkEnv is requested to stop.

pythonWorkers is used in destroyPythonWorker and releasePythonWorker.
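The shape of the registry can be illustrated with a minimal sketch. It is not Spark's code: RegistrySketch, StubFactory, WorkerKey, registry and stopAll are hypothetical names, and the stub only records whether it was stopped. The sketch assumes a mutable hash map as the backing collection. It shows that a (pythonExec, envVars) pair works as a key because Scala tuples and immutable Maps compare by content, and that stopping the registry means stopping every registered factory.

```scala
import scala.collection.mutable

object RegistrySketch {
  // Hypothetical stand-in for PythonWorkerFactory: it only remembers whether stop() was called.
  final class StubFactory {
    var stopped: Boolean = false
    def stop(): Unit = { stopped = true }
  }

  // The key is the (pythonExec, envVars) pair, matching the pythonWorkers signature.
  type WorkerKey = (String, Map[String, String])

  // Starts out empty, like the registry described above.
  val registry: mutable.HashMap[WorkerKey, StubFactory] = mutable.HashMap.empty

  // Asks every registered factory to stop (the behaviour described for SparkEnv.stop).
  def stopAll(): Unit = registry.values.foreach(_.stop())
}
```

Two keys built separately from the same executable name and the same environment variables are equal, so they address the same entry. Changing any single environment variable yields a different key and therefore a separate factory.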

Looking Up or Creating Python Worker Process

createPythonWorker(
  pythonExec: String,
  envVars: Map[String, String]): (java.net.Socket, Option[Int])

createPythonWorker looks up a PythonWorkerFactory (in pythonWorkers) for the given pythonExec and envVars pair. If none is found, createPythonWorker creates a new PythonWorkerFactory and registers it under that pair.

In the end, createPythonWorker requests the PythonWorkerFactory to create a Python worker process.
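The look-up-or-create flow described above can be sketched as follows. This is an illustration under stated assumptions, not Spark's implementation: CreateWorkerSketch, StubFactory and factoriesCreated are hypothetical, the stub's create() returns a counter instead of a (java.net.Socket, Option[Int]) pair so that no Python process is launched, and the use of synchronized with getOrElseUpdate is one plausible way to make the lookup and registration a single step.

```scala
import scala.collection.mutable

object CreateWorkerSketch {
  // Hypothetical stand-in for PythonWorkerFactory. A real factory would start a
  // Python process; this one just hands out increasing worker numbers.
  final class StubFactory(val pythonExec: String, val envVars: Map[String, String]) {
    private var launched = 0
    def create(): Int = { launched += 1; launched }
  }

  private val pythonWorkers =
    mutable.HashMap.empty[(String, Map[String, String]), StubFactory]

  // Counts how many factories were registered (for illustration only).
  var factoriesCreated: Int = 0

  def createPythonWorker(pythonExec: String, envVars: Map[String, String]): Int = synchronized {
    val key = (pythonExec, envVars)
    // Reuse the factory registered under this key, or register a new one.
    val factory = pythonWorkers.getOrElseUpdate(key, {
      factoriesCreated += 1
      new StubFactory(pythonExec, envVars)
    })
    // In the end, the factory is asked to create the worker.
    factory.create()
  }
}
```

Calling the sketch twice with the same pythonExec and envVars reuses one factory and yields two workers from it, while a call with different envVars registers a second factory.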


createPythonWorker is used when: