

Every user program starts by creating an instance of SparkConf that holds the master URL to connect to (spark.master), the name of your Spark application (later displayed in the web UI), and other Spark properties required for proper runs. The instance of SparkConf can then be used to create a SparkContext.
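As a minimal sketch of that flow, assuming Spark is on the classpath: the object name `SparkConfDemo` and the application name `conf-demo` are made up for illustration and do not come from the text above.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SparkConfDemo {
  // Builds a configuration with a master URL and an application name.
  // "conf-demo" is a hypothetical name chosen for this sketch.
  def buildConf(): SparkConf =
    new SparkConf()
      .setMaster("local[*]")   // stored under spark.master
      .setAppName("conf-demo")

  def main(args: Array[String]): Unit = {
    // The SparkConf instance is handed to SparkContext at creation time.
    val sc = new SparkContext(buildConf())
    try {
      println(sc.getConf.get("spark.master"))
    } finally {
      sc.stop() // always release the context, even if the body fails
    }
  }
}
```

Keeping the configuration in a separate `buildConf` method is only a convenience here: it lets you inspect the settings without starting a SparkContext.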


Start Spark shell with --conf spark.logConf=true to log the effective Spark configuration at INFO level when SparkContext is started.

$ ./bin/spark-shell --conf spark.logConf=true
15/10/19 17:13:49 INFO SparkContext: Running Spark version 1.6.0-SNAPSHOT
15/10/19 17:13:49 INFO SparkContext: Spark configuration: shell

Use sc.getConf.toDebugString to have a richer output once SparkContext has finished initializing.

You can query for the values of Spark properties in Spark shell as follows:

scala> sc.getConf.getOption("spark.local.dir")
res0: Option[String] = None

scala> sc.getConf.getOption("")
res1: Option[String] = Some(Spark shell)

scala> sc.getConf.get("spark.master")
res2: String = local[*]

== [[setIfMissing]] setIfMissing Method


== [[isExecutorStartupConf]] isExecutorStartupConf Method


== [[set]] set Method


== Setting up Spark Properties

A Spark application looks for Spark properties in the following places (ordered from the least important to the most important):

  • conf/spark-defaults.conf - the configuration file with the default Spark properties. Read spark-defaults.conf.
  • --conf or -c - the command-line option used by spark-submit (and by other shell scripts that use spark-submit or spark-class under the covers, e.g. spark-shell)
  • SparkConf
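The ordering above can be sketched as follows. This is an illustration under one assumption: the lower-priority value (from spark-defaults.conf or --conf) has already reached the driver JVM as a spark.* system property, which the sketch simulates by hand. The property name `spark.demo.setting` and the object name `PrecedenceDemo` are hypothetical.

```scala
import org.apache.spark.SparkConf

object PrecedenceDemo {
  // Hypothetical property name, used only for this illustration.
  val Key = "spark.demo.setting"

  // Returns (value loaded from the system property, value after an explicit set).
  def resolve(): (String, String) = {
    // Stand-in for a value that arrived from spark-defaults.conf or --conf.
    System.setProperty(Key, "from-system-property")
    try {
      val conf = new SparkConf()   // loads spark.* system properties
      val loaded = conf.get(Key)
      conf.set(Key, "from-code")   // an explicit set replaces the loaded value
      (loaded, conf.get(Key))
    } finally {
      System.clearProperty(Key)    // do not leak the demo property
    }
  }
}
```

Calling `PrecedenceDemo.resolve()` returns the loaded value first and the value set in code second, which mirrors SparkConf being the most important place in the list.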

== [[default-configuration]] Default Configuration

The default Spark configuration is created when you execute the following code:

[source, scala]

import org.apache.spark.SparkConf
val conf = new SparkConf

It simply loads spark.* system properties.

Use conf.toDebugString or conf.getAll to print the spark.* system properties that were loaded.

[source, scala]

scala> conf.getAll
res0: Array[(String, String)] = Array((,Spark shell), (spark.jars,""), (spark.master,local[*]), (spark.submit.deployMode,client))

scala> conf.toDebugString
res1: String =
shell
spark.jars=
spark.master=local[*]
spark.submit.deployMode=client

scala> println(conf.toDebugString)
shell
spark.jars=
spark.master=local[*]
spark.submit.deployMode=client
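To see that the default configuration really comes from spark.* system properties, it may help to compare it with a SparkConf created with `loadDefaults = false`, which skips that loading step. The sketch below assumes Spark is on the classpath; the property name `spark.demo.flag` and the object name `DefaultConfDemo` are hypothetical.

```scala
import org.apache.spark.SparkConf

object DefaultConfDemo {
  // Hypothetical property name, used only for this illustration.
  val Key = "spark.demo.flag"

  // Returns whether Key is visible (with defaults loaded, without defaults loaded).
  def compare(): (Boolean, Boolean) = {
    System.setProperty(Key, "on")
    try {
      val withDefaults = new SparkConf()          // same as new SparkConf(true)
      val withoutDefaults = new SparkConf(false)  // does not read system properties
      (withDefaults.contains(Key), withoutDefaults.contains(Key))
    } finally {
      System.clearProperty(Key)                   // clean up the demo property
    }
  }
}
```

A configuration created with `new SparkConf(false)` starts empty, which can be handy in tests that should not pick up settings from the surrounding JVM.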

== [[getAppId]] Unique Identifier of Spark Application -- getAppId Method

[source, scala]

getAppId: String

getAppId returns the value of[] configuration property or throws a NoSuchElementException if not set.

getAppId is used when:

  • NettyBlockTransferService is requested to init (and creates a NettyBlockRpcServer as well as saves the identifier for later use).

  • Executor is created (in non-local mode and requests BlockManager to initialize).
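The failure mode described above can be observed on a configuration that has no application identifier yet. This is a small sketch, assuming Spark is on the classpath; `AppIdDemo` is a hypothetical name, and `new SparkConf(false)` is used only to guarantee that nothing has been loaded.

```scala
import org.apache.spark.SparkConf
import scala.util.{Failure, Try}

object AppIdDemo {
  // Returns true when getAppId fails with NoSuchElementException
  // on a configuration that carries no application identifier.
  def missingAppId(): Boolean = {
    val conf = new SparkConf(false) // empty: no system properties loaded
    Try(conf.getAppId) match {
      case Failure(_: NoSuchElementException) => true
      case _                                  => false
    }
  }
}
```

Wrapping the call in `Try` is one way to probe for the identifier before the scheduler has registered the application; code that runs later in the application lifecycle can usually call `getAppId` directly.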

== [[getAvroSchema]] getAvroSchema Method

[source, scala]

getAvroSchema: Map[Long, String]

getAvroSchema takes all avro.schema-prefixed configuration properties from <> and...FIXME

getAvroSchema is used when KryoSerializer is created (and initializes avroSchemas).
