PythonRunner is a command-line application to launch Python applications.
PythonRunner is used by spark-submit.
PythonRunner executes a configured Python executable as a subprocess. The subprocess then connects back to the JVM (e.g. to access system properties).
PythonRunner requires the following command-line arguments:
- Main python file (
- Extra python files (
- Application arguments
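
The three-part argument layout above can be sketched as follows. This is only an illustration in Python, not PythonRunner's own code: `RunnerArgs` and `parse_runner_args` are hypothetical names, and the sketch assumes the extra Python files arrive as a single comma-separated argument.

```python
from typing import List, NamedTuple


class RunnerArgs(NamedTuple):
    main_file: str          # main Python file
    extra_files: List[str]  # extra Python files
    app_args: List[str]     # arguments passed through to the application


def parse_runner_args(argv: List[str]) -> RunnerArgs:
    """Split a positional argument vector into the three parts listed above."""
    if len(argv) < 2:
        raise ValueError("expected: <main file> <extra files> [app args...]")
    main_file, extra_files, *app_args = argv
    # Assumption: extra files are one comma-separated string; empty means none.
    extras = [name for name in extra_files.split(",") if name]
    return RunnerArgs(main_file, extras, app_args)
```

Everything after the second position is treated as application arguments and passed through untouched, so the application is free to define its own flags.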
main takes its arguments from the command line.
main determines which Python executable to use based on the following (in order of precedence):
- spark.pyspark.driver.python configuration property
- spark.pyspark.python configuration property
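
The precedence above amounts to a first-match lookup. A minimal sketch, assuming the configuration is available as a plain mapping; `resolve_python_executable` and its `fallback` parameter are hypothetical and only cover the two properties listed here:

```python
from typing import Mapping, Optional

# Properties in the order listed above: the driver-specific one wins.
PRECEDENCE = ("spark.pyspark.driver.python", "spark.pyspark.python")


def resolve_python_executable(
    conf: Mapping[str, str], fallback: Optional[str] = None
) -> Optional[str]:
    """Return the first configured executable, or `fallback` if none is set."""
    for key in PRECEDENCE:
        value = conf.get(key)
        if value:  # skip missing or empty values
            return value
    # Assumption: what happens when neither property is set is left to the caller.
    return fallback
```

For example, a configuration that sets both properties resolves to the value of `spark.pyspark.driver.python`.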
main waits until the gateway server has started.
main launches a Python process using that Python executable and the following environment variables:

| Environment Variable | Value |
|---|---|
| | `spark.pyspark.python` if defined |
| | |
| | |

main waits for the Python process to finish and then requests the Py4JServer to shut down.
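
The launch, wait, and shutdown sequence can be sketched with Python's standard `subprocess` module. This is an analogy under stated assumptions, not the JVM-side implementation: `run_and_shutdown` is a hypothetical helper, the `shutdown` callback stands in for the request to the Py4JServer, and the caller supplies the environment variables.

```python
import os
import subprocess
import sys
from typing import Callable, Mapping, Sequence


def run_and_shutdown(
    command: Sequence[str],
    extra_env: Mapping[str, str],
    shutdown: Callable[[], None],
) -> int:
    """Run `command` with extra environment variables and wait for it to exit.

    `shutdown` is a stand-in for stopping the gateway server; it is called
    whether or not the child process could be started.
    """
    env = dict(os.environ)   # start from the parent's environment
    env.update(extra_env)    # add the variables meant for the child process
    try:
        process = subprocess.Popen(list(command), env=env)
        return process.wait()  # block until the child process finishes
    finally:
        shutdown()
```

Putting the shutdown call in a `finally` block is a design choice of this sketch: it keeps the server from outliving a child process that fails to start.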