pyspark Shell Script¶
pyspark shell script runs spark-submit with pyspark-shell-main application resource as the first argument followed by --name "PySparkShell" option (with other command-line arguments, if specified).
pyspark/shell.py¶
pyspark/shell.py
Learn more about pyspark/shell.py in The Internals of PySpark.
pyspark/shell.py module is launched as a PYTHONSTARTUP script.
Environment Variables¶
pyspark script exports the following environment variables:
- OLD_PYTHONSTARTUP
- PYSPARK_DRIVER_PYTHON
- PYSPARK_DRIVER_PYTHON_OPTS
- PYSPARK_PYTHON
- PYTHONPATH
- PYTHONSTARTUP
OLD_PYTHONSTARTUP¶
pyspark defines OLD_PYTHONSTARTUP environment variable to be the initial value of PYTHONSTARTUP (before it gets redefined).
The idea of OLD_PYTHONSTARTUP is to delay execution of the Python startup script until pyspark/shell.py finishes.
PYSPARK_PYTHON¶
PYSPARK_PYTHON environment variable can be used to specify a Python executable to run PySpark scripts.
The Internals of PySpark
Learn more about PySpark in The Internals of PySpark.
PYSPARK_PYTHON can be overriden by PYSPARK_DRIVER_PYTHON and configuration properties when SparkSubmitCommandBuilder is requested to buildPySparkShellCommand.
PYSPARK_PYTHON is overriden by spark.pyspark.python configuration property, if defined, when SparkSubmitCommandBuilder is requested to buildPySparkShellCommand.
PYTHONSTARTUP¶
From Python Documentation:
PYTHONSTARTUP
If this is the name of a readable file, the Python commands in that file are executed before the first prompt is displayed in interactive mode. The file is executed in the same namespace where interactive commands are executed so that objects defined or imported in it can be used without qualification in the interactive session. You can also change the prompts
sys.ps1andsys.ps2and the hooksys.__interactivehook__in this file.
pyspark (re)defines PYTHONSTARTUP environment variable to be pyspark/shell.py module:
${SPARK_HOME}/python/pyspark/shell.py
OLD_PYTHONSTARTUP
The initial value of PYTHONSTARTUP environment variable is available as OLD_PYTHONSTARTUP.