Every user program starts with creating an instance of SparkConf that holds the master URL to connect to (spark.master), the name for your Spark application (that is later displayed in the web UI and becomes spark.app.name) and other Spark properties required for proper runs. The instance of SparkConf can then be used to create a SparkContext.


Start Spark shell with --conf spark.logConf=true to log the effective Spark configuration at INFO level when SparkContext is started.

$ ./bin/spark-shell --conf spark.logConf=true
15/10/19 17:13:49 INFO SparkContext: Running Spark version 1.6.0-SNAPSHOT
15/10/19 17:13:49 INFO SparkContext: Spark configuration:
spark.app.name=Spark shell
...

Use sc.getConf.toDebugString to have a richer output once SparkContext has finished initializing.

You can query for the values of Spark properties in Spark shell as follows:

scala> sc.getConf.getOption("spark.local.dir")
res0: Option[String] = None

scala> sc.getConf.getOption("spark.app.name")
res1: Option[String] = Some(Spark shell)

scala> sc.getConf.get("spark.master")
res2: String = local[*]

== [[setIfMissing]] setIfMissing Method


== [[isExecutorStartupConf]] isExecutorStartupConf Method


== [[set]] set Method
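
The difference between set and setIfMissing is whether an existing value is overwritten. The contract can be sketched in plain Scala (no Spark on the classpath; ConfSketch below is a hypothetical stand-in for SparkConf, not its actual source):

```scala
// Hypothetical stand-in for SparkConf, sketching the assumed contract:
// set always overwrites, setIfMissing only writes when the key is absent.
import scala.collection.mutable

class ConfSketch {
  private val settings = mutable.Map.empty[String, String]

  def set(key: String, value: String): ConfSketch = {
    settings(key) = value  // unconditional overwrite
    this
  }

  def setIfMissing(key: String, value: String): ConfSketch = {
    if (!settings.contains(key)) settings(key) = value  // keep existing value
    this
  }

  def getOption(key: String): Option[String] = settings.get(key)
}

val conf = new ConfSketch
conf.set("spark.master", "local[*]")
conf.setIfMissing("spark.master", "local[2]")  // ignored: key already set
// conf.getOption("spark.master") == Some("local[*]")
```

Both methods return the configuration itself, which is what allows the familiar chained-builder style (conf.set(...).setIfMissing(...)).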


== Setting up Spark Properties

A Spark application looks for Spark properties in the following places (ordered from least important to most important):

  • conf/spark-defaults.conf - the configuration file with the default Spark properties. Read spark-defaults.conf.
  • --conf or -c - the command-line option used by spark-submit (and other shell scripts that use spark-submit or spark-class under the covers, e.g. spark-shell)
  • SparkConf
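
The precedence can be sketched in plain Scala (no Spark dependency; the helper and its parameter names are illustrative, not Spark's API). Merging the three sources in order of increasing importance means later sources win:

```scala
// Sketch of the precedence rule above (illustrative, not Spark's code):
// spark-defaults.conf < --conf/-c options < explicit SparkConf settings.
object PrecedenceSketch {
  def effective(
      sparkDefaultsConf: Map[String, String],  // least important
      confOptions: Map[String, String],        // --conf or -c
      sparkConfSettings: Map[String, String]   // most important
  ): Map[String, String] =
    sparkDefaultsConf ++ confOptions ++ sparkConfSettings  // later maps win
}

val master = PrecedenceSketch.effective(
  Map("spark.master" -> "local[2]"),  // from spark-defaults.conf
  Map("spark.master" -> "local[4]"),  // from --conf
  Map("spark.master" -> "local[*]")   // set on SparkConf
)("spark.master")
// master == "local[*]": the explicit SparkConf setting wins
```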

== [[default-configuration]] Default Configuration

The default Spark configuration is created when you execute the following code:

[source, scala]
----
import org.apache.spark.SparkConf

val conf = new SparkConf
----

It simply loads spark.* system properties.
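
What "loads spark.* system properties" means can be sketched without Spark on the classpath: the configuration copies every JVM system property whose key starts with spark. (the filter below is an illustrative sketch, not SparkConf's actual source; spark.demo.key is a made-up property):

```scala
// Sketch of loading defaults: copy every "spark."-prefixed JVM system
// property into the configuration (illustrative, not SparkConf's source).
object LoadDefaultsSketch {
  def sparkSystemProps(): Map[String, String] =
    sys.props.toMap.filter { case (key, _) => key.startsWith("spark.") }
}

sys.props("spark.demo.key") = "demo"       // hypothetical property
sys.props("demo.other.key") = "ignored"    // not spark.*, never loaded
// LoadDefaultsSketch.sparkSystemProps() picks up "spark.demo.key" only
```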

You can use conf.toDebugString or conf.getAll to print out the spark.* system properties that have been loaded.

[source, scala]
----
scala> conf.getAll
res0: Array[(String, String)] = Array((spark.app.name,Spark shell), (spark.jars,""), (spark.master,local[*]), (spark.submit.deployMode,client))

scala> conf.toDebugString
res1: String =
spark.app.name=Spark shell
spark.jars=
spark.master=local[*]
spark.submit.deployMode=client

scala> println(conf.toDebugString)
spark.app.name=Spark shell
spark.jars=
spark.master=local[*]
spark.submit.deployMode=client
----

== [[getAppId]] Unique Identifier of Spark Application -- getAppId Method

[source, scala]
----
getAppId: String
----

getAppId returns the value of the spark.app.id configuration property or throws a NoSuchElementException if it is not set.

getAppId is used when:

  • NettyBlockTransferService is requested to init (when it creates a NettyBlockRpcServer and saves the identifier for later use).

  • Executor is created (in non-local mode, when it requests BlockManager to initialize).
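
The contract can be sketched in plain Scala (the helper below is illustrative, not SparkConf's source): look up spark.app.id and fail with a NoSuchElementException when it is absent.

```scala
// Sketch of getAppId's contract (illustrative, not SparkConf's source):
// return spark.app.id, or throw NoSuchElementException when it is not set.
object GetAppIdSketch {
  def getAppId(settings: Map[String, String]): String =
    settings.getOrElse(
      "spark.app.id",
      throw new NoSuchElementException("spark.app.id"))
}

// GetAppIdSketch.getAppId(Map("spark.app.id" -> "app-123")) == "app-123"
// GetAppIdSketch.getAppId(Map.empty) throws NoSuchElementException
```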

== [[getAvroSchema]] getAvroSchema Method

[source, scala]
----
getAvroSchema: Map[Long, String]
----

getAvroSchema takes all avro.schema-prefixed configuration properties and...FIXME

getAvroSchema is used when KryoSerializer is created (and initializes avroSchemas).
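
A hedged sketch of what the filtering could look like, assuming schemas are registered under avro.schema.<fingerprint> keys where the fingerprint is a Long (the helper below is illustrative, not SparkConf's source):

```scala
// Illustrative sketch: collect avro.schema.-prefixed properties and key the
// result by the Long fingerprint encoded in the property-name suffix.
object AvroSchemaSketch {
  private val avroNamespace = "avro.schema."

  def getAvroSchema(props: Map[String, String]): Map[Long, String] =
    props
      .filter { case (key, _) => key.startsWith(avroNamespace) }
      .map { case (key, json) => key.stripPrefix(avroNamespace).toLong -> json }
}

// AvroSchemaSketch.getAvroSchema(Map("avro.schema.42" -> "<schema json>"))
//   == Map(42L -> "<schema json>")
```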