pyspark Shell Script
Learn more about pyspark/shell.py in The Internals of PySpark.
pyspark/shell.py module is launched as a PYTHONSTARTUP script.
pyspark script exports the following environment variables:
OLD_PYTHONSTARTUP environment variable to be the initial value of PYTHONSTARTUP (before it gets redefined).
The idea of OLD_PYTHONSTARTUP is to delay execution of the user's original Python startup script until pyspark/shell.py finishes.
PYSPARK_PYTHON environment variable can be used to specify a Python executable to run PySpark scripts.
Learn more about PySpark in The Internals of PySpark.
PYSPARK_PYTHON is overridden by the spark.pyspark.python configuration property, if defined, when SparkSubmitCommandBuilder is requested to buildPySparkShellCommand.
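The precedence described above can be sketched in a few lines of Python. This is only an illustration: the real logic lives in SparkSubmitCommandBuilder (JVM code) and may consult additional settings. The function name `resolve_pyspark_python` and the `python3` fallback are assumptions made for this sketch, not part of Spark's API.

```python
import os


def resolve_pyspark_python(conf, env=None, default="python3"):
    """Pick a Python executable: the config property wins over the env variable.

    `conf` is a plain dict standing in for Spark configuration properties
    (hypothetical helper, for illustration only).
    """
    env = os.environ if env is None else env
    # spark.pyspark.python, if defined, overrides PYSPARK_PYTHON.
    for candidate in (conf.get("spark.pyspark.python"), env.get("PYSPARK_PYTHON")):
        if candidate:
            return candidate
    # The fallback value is an assumption of this sketch.
    return default
```

With both settings present, the configuration property is returned; with only PYSPARK_PYTHON set, the environment variable is used.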
From Python Documentation:
If this is the name of a readable file, the Python commands in that file are executed before the first prompt is displayed in interactive mode. The file is executed in the same namespace where interactive commands are executed so that objects defined or imported in it can be used without qualification in the interactive session. You can also change the prompts sys.ps1 and sys.ps2 and the hook sys.__interactivehook__ in this file.
PYTHONSTARTUP environment variable to be the pyspark/shell.py module.
The initial value of the PYTHONSTARTUP environment variable is available as OLD_PYTHONSTARTUP.
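The hand-off between the two variables can be sketched as follows. This is a minimal Python model, not the actual pyspark script (which is a shell script). The helpers `export_startup_variables` and `run_old_startup` are hypothetical names introduced here, and any path passed as `shell_module_path` is a placeholder. The first function models what the launcher exports. The second models how a startup module could run the preserved script once its own work is done.

```python
import os


def export_startup_variables(env, shell_module_path):
    """Return a copy of env with PYTHONSTARTUP redirected to the shell module."""
    new_env = dict(env)
    # Preserve the user's original startup script (empty string if unset).
    new_env["OLD_PYTHONSTARTUP"] = env.get("PYTHONSTARTUP", "")
    # Redefine PYTHONSTARTUP so the interpreter runs the shell module first.
    new_env["PYTHONSTARTUP"] = shell_module_path
    return new_env


def run_old_startup(env, namespace):
    """Execute the preserved startup script, if any, in the given namespace.

    Returns True when a script was found and executed, False otherwise.
    """
    old_startup = env.get("OLD_PYTHONSTARTUP")
    if old_startup and os.path.isfile(old_startup):
        with open(old_startup) as f:
            code = compile(f.read(), old_startup, "exec")
        exec(code, namespace)
        return True
    return False
```

Calling `run_old_startup` as the last step of the startup module gives the delayed execution described above: the user's own script still runs, but only after the shell module has finished setting up the session.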