EventLoggingListener supports custom configuration properties.
EventLoggingListener writes out log files to a directory (based on the spark.eventLog.dir configuration property). All SparkListenerEvents are logged (except SparkListenerBlockUpdated and SparkListenerExecutorMetricsUpdate).
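As a quick, hedged illustration (not part of the original walkthrough), event logging is typically switched on with the spark.eventLog.enabled and spark.eventLog.dir properties; the directory below is just an example and has to exist before the application starts:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Enable event logging and point it at an existing directory
// (file:///tmp/spark-events is only an example location).
val conf = new SparkConf()
  .set("spark.eventLog.enabled", "true")
  .set("spark.eventLog.dir", "file:///tmp/spark-events")

val spark = SparkSession.builder()
  .appName("event-logging-demo")
  .master("local[*]")
  .config(conf)
  .getOrCreate()

spark.range(10).count()  // generates job, stage and task events that get logged

spark.stop()             // finalizes the event log file
```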
TIP: Use Spark History Server to view the event logs in a browser (similar to the web UI of a Spark application).
EventLoggingListener can compress events (based on spark.eventLog.compress configuration property).
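A minimal sketch of turning compression on (the codec value below is illustrative; it is resolved through spark.io.compression.codec):

```scala
import org.apache.spark.SparkConf

// Sketch: compress the event log; the codec used comes from
// spark.io.compression.codec (lz4 is just one valid choice).
val compressedConf = new SparkConf()
  .set("spark.eventLog.enabled", "true")
  .set("spark.eventLog.dir", "file:///tmp/spark-events")
  .set("spark.eventLog.compress", "true")
  .set("spark.io.compression.codec", "lz4")
```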
EventLoggingListener takes the following to be created:

- appId
- appAttemptId (optional)
- logBaseDir
- SparkConf
- Hadoop Configuration
EventLoggingListener initializes the internal properties.
The log file's working name is created based on the appId (e.g. local-1461696754069), with or without the compression codec used, and the optional appAttemptId. It also uses the .inprogress extension while the application is running.
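A minimal sketch of how such a working name could be assembled; the helper below is hypothetical and only mirrors the description above (the real sanitization rules may differ):

```scala
// Hypothetical helper: the working name is the (sanitized) appId, optionally
// followed by the attempt id and the codec short name, plus .inprogress while
// the application is still running.
def workingLogName(
    appId: String,
    appAttemptId: Option[String],
    codecShortName: Option[String]): String = {
  val base = appId.replaceAll("[ :/]", "-").toLowerCase
  val withAttempt = appAttemptId.fold(base)(attempt => s"${base}_$attempt")
  val withCodec = codecShortName.fold(withAttempt)(codec => s"$withAttempt.$codec")
  s"$withCodec.inprogress"
}

// e.g. workingLogName("local-1461696754069", None, Some("lz4"))
//      == "local-1461696754069.lz4.inprogress"
```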
If overwrite is enabled, you should see the WARN message:
Event log [path] already exists. Overwriting...
The working log file (the one with the .inprogress extension) is attempted to be deleted. In case it could not be deleted, the following WARN message is printed out to the logs:
Error deleting [path]
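A sketch of that best-effort deletion step using the Hadoop FileSystem API (names, and the plain println standing in for real logging, are illustrative):

```scala
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Sketch: try to remove a stale .inprogress file and warn when that fails.
def deleteStaleWorkingFile(workingPath: String, hadoopConf: Configuration): Unit = {
  val path = new Path(workingPath)
  val fs = FileSystem.get(new URI(workingPath), hadoopConf)
  if (fs.exists(path) && !fs.delete(path, /* recursive = */ false)) {
    println(s"WARN Error deleting $workingPath")
  }
}
```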
The buffered output stream is created, with the metadata (Spark's version and the SparkListenerLogStart class name) written out as the first line.
At this point, EventLoggingListener is ready for event logging and you should see the following INFO message in the logs:
Logging events to [logPath]
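A sketch of that first metadata line; the exact JSON layout is an assumption based on the description above, and the path and version are only examples:

```scala
import java.io.{BufferedOutputStream, FileOutputStream, PrintWriter}

// Sketch: the first line of the event log is a metadata record carrying the
// Spark version and the SparkListenerLogStart event name (assumed layout).
val sparkVersion = "2.4.8"  // illustrative
val out = new PrintWriter(
  new BufferedOutputStream(
    new FileOutputStream("/tmp/spark-events/local-1461696754069.inprogress")))
out.println(s"""{"Event":"SparkListenerLogStart","Spark Version":"$sparkVersion"}""")
out.close()
```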
start throws an IllegalArgumentException when the logBaseDir is not a directory:
Log directory [logBaseDir] is not a directory.
logEvent(event: SparkListenerEvent, flushLogger: Boolean = false): Unit
logEvent logs the given event as JSON.
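A sketch of the event-to-JSON step using Spark's own JsonProtocol. Note that JsonProtocol is Spark-internal (private[spark]) and its API changed across versions (Spark 2.x returns a json4s JValue, newer releases expose a string-based variant), so treat this strictly as a conceptual illustration:

```scala
import org.apache.spark.scheduler.SparkListenerApplicationStart
import org.apache.spark.util.JsonProtocol
import org.json4s.jackson.JsonMethods.{compact, render}

// Sketch (Spark 2.x-style; compiles only from code inside the
// org.apache.spark package because JsonProtocol is private[spark]):
// serialize a listener event into a single JSON line.
val event = SparkListenerApplicationStart(
  appName = "demo",
  appId = Some("local-1461696754069"),
  time = System.currentTimeMillis(),
  sparkUser = "spark",
  appAttemptId = None,
  driverLogs = None)

val jsonLine = compact(render(JsonProtocol.sparkEventToJson(event)))
println(jsonLine)  // one line per event; flushed immediately when flushLogger is true
```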
stop closes the PrintWriter for the log file and renames the file to be without the .inprogress extension.
If the target log file exists (one without the .inprogress extension), it overwrites the file if spark.eventLog.overwrite is enabled. You should see the following WARN message in the logs:
Event log [target] already exists. Overwriting...
If the target log file exists and overwrite is disabled, a java.io.IOException is thrown with the following message:
Target log file already exists ([logPath])
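A sketch of the finalize-and-rename behavior described above, with the overwrite handling made explicit (illustrative names; real logging replaced with println):

```scala
import java.io.IOException
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Sketch: promote the .inprogress working file to the final log file,
// honoring the overwrite flag as described above.
def finalizeEventLog(
    workingPath: String,
    targetPath: String,
    overwrite: Boolean,
    hadoopConf: Configuration): Unit = {
  val fs = FileSystem.get(new URI(targetPath), hadoopConf)
  val target = new Path(targetPath)
  if (fs.exists(target)) {
    if (overwrite) {
      println(s"WARN Event log $target already exists. Overwriting...")
      fs.delete(target, false)
    } else {
      throw new IOException(s"Target log file already exists ($targetPath)")
    }
  }
  fs.rename(new Path(workingPath), target)
}
```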
openEventLog(log: Path, fs: FileSystem): InputStream
openEventLog is used when…FIXME
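Even with the call site left as FIXME, reading an event log back is easy to sketch for the uncompressed case (a compressed log would additionally need the codec matching the file extension, which this sketch ignores):

```scala
import scala.io.Source
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Sketch: stream the JSON lines of an uncompressed event log.
val hadoopConf = new Configuration()
val logPath = new Path("file:///tmp/spark-events/local-1461696754069")
val fs = logPath.getFileSystem(hadoopConf)
val in = fs.open(logPath)
try {
  Source.fromInputStream(in, "UTF-8").getLines().foreach(println)
} finally {
  in.close()
}
```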
Enable ALL logging level for org.apache.spark.scheduler.EventLoggingListener logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.scheduler.EventLoggingListener=ALL

Refer to Logging.
hadoopDataStream: Hadoop FSDataOutputStream (default: None)