PythonWorkerFactory

Creating Instance

PythonWorkerFactory takes the following to be created:

PythonWorkerFactory is created when SparkEnv is requested to createPythonWorker (when BasePythonRunner is requested to compute a partition).

useDaemon Flag

PythonWorkerFactory uses the useDaemon internal flag, which holds the value of the spark.python.use.daemon configuration property, to decide whether to use (lighter) daemon workers or non-daemon workers.

The useDaemon flag is used when PythonWorkerFactory is requested to create, stop or release a worker, and to stop a daemon module.
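How the flag might be resolved can be sketched in a few lines of Python. This is an illustration only, not Spark's code: the helper name resolve_use_daemon and the dict-based configuration are hypothetical. It also assumes two things not stated above: that spark.python.use.daemon defaults to true when unset, and that daemon mode is ignored on Windows (where forking is not available).

```python
def resolve_use_daemon(conf: dict, os_name: str) -> bool:
    """Hypothetical helper: decide whether daemon-based workers are used."""
    # Assumption: spark.python.use.daemon defaults to "true" when unset.
    raw = str(conf.get("spark.python.use.daemon", "true")).strip().lower()
    enabled = raw == "true"
    # Assumption: the flag is ignored on Windows, which cannot fork.
    return enabled and not os_name.startswith("Windows")
```

For example, resolve_use_daemon({"spark.python.use.daemon": "false"}, "Linux") disables daemon mode, while an empty configuration on Linux leaves it enabled.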

Python Daemon Module

PythonWorkerFactory uses spark.python.daemon.module configuration property to define the Python Daemon Module.

The Python Daemon Module is used when PythonWorkerFactory is requested to create and start a daemon module.

Python Worker Module

PythonWorkerFactory uses spark.python.worker.module configuration property to specify the Python Worker Module.

The Python Worker Module is used when PythonWorkerFactory is requested to create and start a worker.
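The two module properties can be pictured as selecting what is passed to the Python executable's -m option. The following Python sketch is a hedged illustration: the function python_command and the dict-based configuration are hypothetical, and the defaults pyspark.daemon and pyspark.worker are assumptions about what is used when the properties are not set.

```python
# Assumed defaults when the configuration properties are unset.
DEFAULT_DAEMON_MODULE = "pyspark.daemon"
DEFAULT_WORKER_MODULE = "pyspark.worker"


def python_command(python_exec: str, conf: dict, use_daemon: bool) -> list:
    """Hypothetical helper: build the command that launches the Python side."""
    if use_daemon:
        # Daemon mode: start the Python Daemon Module.
        module = conf.get("spark.python.daemon.module", DEFAULT_DAEMON_MODULE)
    else:
        # Non-daemon mode: start the Python Worker Module directly.
        module = conf.get("spark.python.worker.module", DEFAULT_WORKER_MODULE)
    # Equivalent to running: <python_exec> -m <module>
    return [python_exec, "-m", module]
```

With an empty configuration and daemon mode on, python_command("python3", {}, True) yields the arguments for python3 -m pyspark.daemon; a custom module is picked up only from the property that matches the chosen mode.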

Creating Python Worker

create(): Socket

create...FIXME

create is used when SparkEnv is requested to createPythonWorker.

Creating Daemon Worker

createThroughDaemon(): Socket

createThroughDaemon...FIXME

createThroughDaemon is used when PythonWorkerFactory is requested to create a Python worker (with useDaemon flag enabled).

Starting Python Daemon Process

startDaemon(): Unit

startDaemon...FIXME

Creating Simple Non-Daemon Worker

createSimpleWorker(): Socket

createSimpleWorker...FIXME

createSimpleWorker is used when PythonWorkerFactory is requested to create a Python worker (with useDaemon flag disabled).
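The relationship between create, createThroughDaemon and createSimpleWorker amounts to a dispatch on the useDaemon flag. A minimal Python sketch of that dispatch follows; the function create_worker and its callable parameters are hypothetical stand-ins for the Scala methods, and anything else create may do is deliberately left out.

```python
from typing import Callable


def create_worker(
    use_daemon: bool,
    create_through_daemon: Callable[[], object],
    create_simple_worker: Callable[[], object],
) -> object:
    """Hypothetical sketch of the useDaemon dispatch performed by create."""
    if use_daemon:
        # useDaemon enabled: ask the daemon process for a worker.
        return create_through_daemon()
    # useDaemon disabled: launch a standalone (non-daemon) worker.
    return create_simple_worker()
```

Passing the two strategies as callables keeps the sketch self-contained; in the real class they are methods that return a Socket connected to the Python worker.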

Logging

Enable ALL logging level for org.apache.spark.api.python.PythonWorkerFactory logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.api.python.PythonWorkerFactory=ALL

Refer to Logging.


Last update: 2021-02-19