SparkContext¶
Creating Instance¶
SparkContext takes the following to be created:
- Master URL (default: None)
- Application Name (default: None)
- Spark Home (default: None)
- Py Files (default: None)
- Environment (default: None)
- Batch Size (default: 0)
- PickleSerializer
- SparkConf (default: None)
- Gateway (default: None)
- Corresponding SparkContext on JVM (default: None)
- BasicProfiler
While being created, SparkContext calls _ensure_initialized (with the gateway and the conf) followed by _do_init.
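The two-step creation order can be illustrated with a toy class. This is a minimal sketch and not PySpark code: ToyContext and its steps list are hypothetical names introduced only to make the call order visible, and the method bodies are placeholders.

```python
class ToyContext:
    """Hypothetical stand-in for SparkContext; not PySpark code."""

    def __init__(self, master=None, appName=None, gateway=None, conf=None):
        self.steps = []  # records the order of the initialization steps
        # Step 1: class-level setup, given only the gateway and the conf.
        self._ensure_initialized(self, gateway=gateway, conf=conf)
        # Step 2: instance-level setup with the remaining arguments.
        self._do_init(master, appName, conf)

    @classmethod
    def _ensure_initialized(cls, instance=None, gateway=None, conf=None):
        # Placeholder body: a real implementation would set up shared state here.
        if instance is not None:
            instance.steps.append("_ensure_initialized")

    def _do_init(self, master, appName, conf):
        self.steps.append("_do_init")
        self.master, self.appName, self.conf = master, appName, conf


ctx = ToyContext(master="local[*]", appName="demo")
print(ctx.steps)
```

Splitting the work this way keeps process-wide setup (the gateway) apart from per-instance configuration.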
Demo¶
from pyspark import SparkContext
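A slightly longer demo might pass some of the creation arguments by keyword. This is a sketch under stated assumptions: the keyword names are taken from the _do_init signature on this page, the values are arbitrary examples, and create_context is a hypothetical helper. Running it needs PySpark and a Java runtime, so the import is deferred until the helper is called.

```python
# Keyword names follow the _do_init signature; the values are arbitrary examples.
context_kwargs = {
    "master": "local[*]",  # Master URL
    "appName": "demo",     # Application Name
    "batchSize": 0,        # Batch Size (the documented default)
}


def create_context(**kwargs):
    """Hypothetical helper: create a SparkContext from keyword arguments.

    Requires PySpark and a Java runtime when called.
    """
    # Imported lazily so that this module loads even without PySpark installed.
    from pyspark import SparkContext

    return SparkContext(**kwargs)
```

Calling create_context(**context_kwargs) would start a local context; call stop() on the returned object when done with it.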
JavaGateway¶
SparkContext defines a _gateway property for a JavaGateway that is given or launched when _ensure_initialized is called.
JVMView¶
SparkContext defines a _jvm property for a JVMView (py4j) to access the Java Virtual Machine of the JavaGateway.
_ensure_initialized¶
_ensure_initialized(
cls, instance=None, gateway=None, conf=None)
_ensure_initialized is a @classmethod.
_ensure_initialized uses the given gateway or, when none is given, launches one using launch_gateway.
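The "given gateway or launch one" choice can be sketched with a toy class method. The names launch_toy_gateway and ToyContext are hypothetical, the gateway is a plain dict rather than a py4j JavaGateway, and keeping the gateway in a lock-guarded class attribute is an assumption of this sketch rather than something this page states.

```python
import threading


def launch_toy_gateway(conf=None):
    """Hypothetical launcher; stands in for starting a real gateway."""
    launch_toy_gateway.count += 1
    return {"id": launch_toy_gateway.count, "conf": conf}


launch_toy_gateway.count = 0  # how many gateways have been launched so far


class ToyContext:
    """Hypothetical stand-in for SparkContext; not PySpark code."""

    _lock = threading.RLock()
    _gateway = None  # class attribute shared by every instance (assumption)

    @classmethod
    def _ensure_initialized(cls, instance=None, gateway=None, conf=None):
        with cls._lock:
            if cls._gateway is None:
                # Prefer the caller's gateway; launch one only as a fallback.
                cls._gateway = gateway or launch_toy_gateway(conf)
        return cls._gateway


first = ToyContext._ensure_initialized()
second = ToyContext._ensure_initialized()
```

In this sketch the second call reuses the gateway launched by the first, so repeated calls are cheap and idempotent.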
_ensure_initialized ...FIXME
_ensure_initialized is used when:
_do_init¶
_do_init(
self, master, appName, sparkHome,
pyFiles, environment, batchSize, serializer,
conf, jsc, profiler_cls)
_do_init ...FIXME