EventLoggingListener is a SparkListener that writes out JSON-encoded events of a Spark application when event logging is enabled (based on the spark.eventLog.enabled configuration property).

EventLoggingListener supports custom configuration properties.

EventLoggingListener writes out log files to a directory (based on the spark.eventLog.dir configuration property). All events are logged except SparkListenerBlockUpdated and SparkListenerExecutorMetricsUpdate.

Use the Spark History Server to view the event logs in a browser (similar to the web UI of a Spark application).

EventLoggingListener uses .inprogress file extension for in-flight event log files of active Spark applications.

EventLoggingListener can compress events (based on spark.eventLog.compress configuration property).
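For example, the properties above can be set in conf/spark-defaults.conf (the event log directory below is an assumption for illustration):

```
spark.eventLog.enabled   true
spark.eventLog.dir       hdfs://namenode:8021/spark-events
spark.eventLog.compress  true
```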

Creating Instance

EventLoggingListener takes the following to be created:

* Application ID
* Application attempt ID (optional)
* Log base directory
* Hadoop Configuration (optional)

EventLoggingListener initializes the internal properties.

When initialized with no Hadoop Configuration, EventLoggingListener uses SparkHadoopUtil utility to create a new one.

Event Log File

logPath: String

logPath is…​FIXME

logPath is used when EventLoggingListener is requested to start and stop.

Starting EventLoggingListener

start(): Unit

start deletes the log file with the .inprogress extension if it already exists.

The working log file name is built from appId (with the compression codec extension when compression is enabled) and appAttemptId, e.g. local-1461696754069, plus the .inprogress extension.

If overwrite is enabled, you should see the WARN message:

Event log [path] already exists. Overwriting...

start attempts to delete the working .inprogress log file. If it cannot be deleted, the following WARN message is printed out to the logs:

Error deleting [path]

start creates a buffered output stream and writes out the metadata, i.e. a SparkListenerLogStart event with Spark's version, as the first line:

{"Event":"SparkListenerLogStart","Spark Version":"2.0.0-SNAPSHOT"}

At this point, EventLoggingListener is ready for event logging and you should see the following INFO message in the logs:

Logging events to [logPath]
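The metadata-writing step above can be sketched with plain java.io (a simplified stand-in for the actual implementation; the temporary file and the version string are assumptions for illustration):

```scala
import java.io.{BufferedOutputStream, FileOutputStream, PrintWriter}
import java.nio.file.Files

// Assumed version string and a temporary stand-in for the .inprogress log file.
val sparkVersion = "2.0.0-SNAPSHOT"
val logPath = Files.createTempFile("eventlog-", ".inprogress")

// A buffered output stream wrapped in a PrintWriter, as described above.
val out = new BufferedOutputStream(new FileOutputStream(logPath.toFile))
val writer = new PrintWriter(out)

// The first line is the SparkListenerLogStart event with Spark's version.
writer.println(s"""{"Event":"SparkListenerLogStart","Spark Version":"$sparkVersion"}""")
writer.flush()
writer.close()

println(Files.readAllLines(logPath).get(0))
// {"Event":"SparkListenerLogStart","Spark Version":"2.0.0-SNAPSHOT"}
```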

start throws an IllegalArgumentException when the logBaseDir is not a directory:

Log directory [logBaseDir] is not a directory.

start is executed when SparkContext is created.

Logging Event (In JSON Format)

logEvent(
  event: SparkListenerEvent,
  flushLogger: Boolean = false): Unit

logEvent converts the given event to JSON and writes it out as a single line to the event log file. With flushLogger enabled, logEvent also flushes the writer.
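The behavior can be sketched as follows (a minimal stand-in: the event type and its JSON rendering below are hand-rolled assumptions, not Spark's JsonProtocol):

```scala
import java.io.{PrintWriter, StringWriter}

// Hand-rolled stand-ins for an event and its JSON rendering.
case class SparkListenerEvent(name: String)
def toJson(event: SparkListenerEvent): String = s"""{"Event":"${event.name}"}"""

// An in-memory writer standing in for the event log file's PrintWriter.
val buffer = new StringWriter
val writer = new PrintWriter(buffer)

// Write the event as a single JSON line; flush only when asked to.
def logEvent(event: SparkListenerEvent, flushLogger: Boolean = false): Unit = {
  writer.println(toJson(event))
  if (flushLogger) writer.flush()
}

logEvent(SparkListenerEvent("SparkListenerApplicationStart"), flushLogger = true)
println(buffer.toString.trim)
// {"Event":"SparkListenerApplicationStart"}
```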


Stopping EventLoggingListener

stop(): Unit

stop closes the PrintWriter for the log file and renames the file to remove the .inprogress extension.

If the target log file exists (one without .inprogress extension), it overwrites the file if spark.eventLog.overwrite is enabled. You should see the following WARN message in the logs:

Event log [target] already exists. Overwriting...

If the target log file exists and overwrite is disabled, a java.io.IOException is thrown with the following message:

Target log file already exists ([logPath])

stop is executed when SparkContext is requested to stop.
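The rename step can be sketched with java.nio (a simplified stand-in; the helper name, paths, and the overwrite flag standing in for spark.eventLog.overwrite are assumptions):

```scala
import java.io.IOException
import java.nio.file.{Files, Path, StandardCopyOption}

// Rename an in-progress event log to its final name, honoring an
// overwrite flag (standing in for spark.eventLog.overwrite).
def finalizeLog(inProgress: Path, overwrite: Boolean): Path = {
  val target = inProgress.resolveSibling(
    inProgress.getFileName.toString.stripSuffix(".inprogress"))
  if (Files.exists(target) && !overwrite)
    throw new IOException(s"Target log file already exists ($target)")
  Files.move(inProgress, target, StandardCopyOption.REPLACE_EXISTING)
  target
}

val dir = Files.createTempDirectory("spark-events")
val inProgress = Files.createFile(dir.resolve("local-1461696754069.inprogress"))
val finalLog = finalizeLog(inProgress, overwrite = false)
println(finalLog.getFileName)
// local-1461696754069
```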

getLogPath Utility

getLogPath(
  logBaseDir: URI,
  appId: String,
  appAttemptId: Option[String],
  compressionCodecName: Option[String] = None): String


getLogPath is used when EventLoggingListener is created (for the logPath).
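The naming scheme can be sketched as follows (a simplified reconstruction; the exact sanitization rules and String-based signature are assumptions for illustration):

```scala
// Replace characters that are unsafe in file names; an assumption
// approximating what a log-path utility could do.
def sanitize(str: String): String =
  str.replaceAll("[ :/]", "-").replaceAll("[.${}'\"]", "_").toLowerCase

// Build the event log path from the base directory, application ID,
// optional attempt ID and optional compression codec name.
def getLogPath(
    logBaseDir: String,
    appId: String,
    appAttemptId: Option[String],
    compressionCodecName: Option[String] = None): String = {
  val codec = compressionCodecName.map("." + _).getOrElse("")
  val base = logBaseDir.stripSuffix("/") + "/" + sanitize(appId)
  appAttemptId match {
    case Some(attempt) => base + "_" + sanitize(attempt) + codec
    case None          => base + codec
  }
}

println(getLogPath("/tmp/spark-events", "local-1461696754069", None))
// /tmp/spark-events/local-1461696754069
```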

openEventLog Utility

openEventLog(
  log: Path,
  fs: FileSystem): InputStream


openEventLog is used when…​FIXME
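A sketch of what opening an event log could involve (a simplified assumption: codec selection by file extension, with only gzip handled here and java.nio in place of Hadoop's FileSystem):

```scala
import java.io.{BufferedInputStream, FileInputStream, FileOutputStream, InputStream}
import java.nio.file.{Files, Path}
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

// Open an event log, decompressing based on the file extension.
// Only gzip is handled in this sketch; real codec support would differ.
def openEventLog(log: Path): InputStream = {
  val in = new BufferedInputStream(new FileInputStream(log.toFile))
  if (log.getFileName.toString.endsWith(".gz")) new GZIPInputStream(in)
  else in
}

// Usage: write a small compressed log and read it back.
val tmp = Files.createTempFile("eventlog-", ".gz")
val gz = new GZIPOutputStream(new FileOutputStream(tmp.toFile))
gz.write("""{"Event":"SparkListenerLogStart"}""".getBytes("UTF-8"))
gz.close()

val line = scala.io.Source.fromInputStream(openEventLog(tmp)).mkString
println(line)
// {"Event":"SparkListenerLogStart"}
```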


Enable ALL logging level for org.apache.spark.scheduler.EventLoggingListener logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.scheduler.EventLoggingListener=ALL
Refer to Logging.

Internal Properties


Hadoop FSDataOutputStream (default: None)

Used when…​FIXME


Initialized when EventLoggingListener is requested to start

Used when EventLoggingListener is requested to logEvent

Closed when EventLoggingListener is requested to stop