PySpark Setup
Install IPython
Follow the installation steps in the official IPython documentation. With pip, the command is:
pip install ipython
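To confirm that the installation is visible to the interpreter you plan to use, a quick check from Python can help. This is only a minimal sketch: the helper name `ipython_available` is hypothetical, and it relies solely on the standard library's `importlib`.

```python
import importlib.util


def ipython_available() -> bool:
    """Return True if the IPython package can be imported by this interpreter."""
    # find_spec returns None when the top-level package is not installed.
    return importlib.util.find_spec("IPython") is not None


if __name__ == "__main__":
    if ipython_available():
        print("IPython is installed for this interpreter.")
    else:
        print("IPython not found; run `pip install ipython` first.")
```

Run it with the same Python that PySpark will use; a different interpreter may have a different set of installed packages.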
Start PySpark
Set the driver's Python executable to IPython so that the PySpark shell opens inside it:
export PYSPARK_DRIVER_PYTHON=ipython
On Java 11, also pass -Dio.netty.tryReflectionSetAccessible=true to the driver JVM; the Apache Arrow library requires it
(see Downloading in the official documentation of Apache Spark).
./bin/pyspark --driver-java-options=-Dio.netty.tryReflectionSetAccessible=true
Python 3.9.1 (default, Feb 3 2021, 07:38:02)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.20.0 -- An enhanced Interactive Python. Type '?' for help.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 3.1.1
      /_/
Using Python version 3.9.1 (default, Feb 3 2021 07:38:02)
Spark context Web UI available at http://192.168.68.101:4040
Spark context available as 'sc' (master = local[*], app id = local-1613571272142).
SparkSession available as 'spark'.
In [1]:
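The launch steps above (the environment variable, the Java 11 flag, and the ./bin/pyspark call) can also be assembled from Python, for example in a small wrapper script. The following is a sketch under stated assumptions, not part of Spark itself: the function name `build_pyspark_launch` and its `spark_home` and `java_major` parameters are hypothetical, and the flag is added only when the caller reports Java 11, since that is the only case described here.

```python
import os

# JVM option quoted from the instructions above.
JAVA11_NETTY_FLAG = "-Dio.netty.tryReflectionSetAccessible=true"


def build_pyspark_launch(spark_home: str, java_major: int):
    """Return (argv, env) for starting the PySpark shell with IPython as the driver."""
    env = dict(os.environ)
    # Same effect as `export PYSPARK_DRIVER_PYTHON=ipython` in the shell.
    env["PYSPARK_DRIVER_PYTHON"] = "ipython"

    argv = [os.path.join(spark_home, "bin", "pyspark")]
    if java_major == 11:
        # Assumption: only Java 11 gets the extra flag, as stated in this guide.
        argv.append("--driver-java-options=" + JAVA11_NETTY_FLAG)
    return argv, env


if __name__ == "__main__":
    argv, env = build_pyspark_launch(".", java_major=11)
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(argv))
```

To actually start the shell, the returned values could be handed to `subprocess.run(argv, env=env)` from the standard library.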
In [1]: spark.version
Out[1]: '3.1.1'
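The string returned by spark.version can be compared against a minimum release before running version-dependent code. The helpers below are a hypothetical sketch (the names `parse_spark_version` and `at_least` are not part of PySpark) and use plain Python only.

```python
def parse_spark_version(version: str) -> tuple:
    """Turn a version string such as '3.1.1' into a tuple of ints."""
    parts = []
    for piece in version.split("."):
        # Keep only the leading digits so suffixes like '0-SNAPSHOT' still parse.
        digits = ""
        for ch in piece:
            if ch not in "0123456789":
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def at_least(version: str, minimum: tuple) -> bool:
    """Return True if `version` is greater than or equal to `minimum`."""
    return parse_spark_version(version) >= minimum
```

Inside the shell started above, `at_least(spark.version, (3, 1))` would then tell you whether the session runs Spark 3.1 or newer.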