MetricsSystem

MetricsSystem is a registry of metrics sources and sinks of a Spark subsystem.

Creating Instance

MetricsSystem takes the following to be created:

  • Instance name
  • SparkConf
  • SecurityManager

While being created, MetricsSystem requests the MetricsConfig to initialize.

Creating MetricsSystem

MetricsSystem is created (using createMetricsSystem utility) for the Metrics Systems.

createMetricsSystem Utility

createMetricsSystem(
  instance: String,
  conf: SparkConf,
  securityMgr: SecurityManager): MetricsSystem

createMetricsSystem creates a new MetricsSystem (for the given parameters).

createMetricsSystem is used to create metrics systems.

Metrics Sources for Spark SQL

  • CodegenMetrics
  • HiveCatalogMetrics

Registering Metrics Source

registerSource(
  source: Source): Unit

registerSource adds source to the sources internal registry.

registerSource creates an identifier for the metrics source and registers it with the MetricRegistry.

registerSource uses Metrics' MetricRegistry.register to register a metrics source under a given name.

registerSource prints out the following INFO message to the logs when registering a name more than once:

Metrics already registered
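The registration flow above can be sketched in plain Scala. This is a simplified model under stated assumptions, not Spark's code: SimpleSource and ToyRegistry are hypothetical stand-ins for a metrics source and for Dropwizard's MetricRegistry, and the log is modelled as an in-memory buffer. The sketch only shows that a duplicate name is reported with an INFO message rather than failing the caller.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a metrics source (not Spark's Source trait).
final case class SimpleSource(sourceName: String)

// Toy registry that rejects duplicate names with an IllegalArgumentException.
final class ToyRegistry {
  private val names = mutable.Set.empty[String]
  def register(name: String): Unit = {
    // mutable.Set.add returns false when the name is already present
    if (!names.add(name)) {
      throw new IllegalArgumentException(s"A metric named $name already exists")
    }
  }
}

object RegisterSourceSketch {
  val sources = mutable.ArrayBuffer.empty[SimpleSource]
  val registry = new ToyRegistry
  val log = mutable.ArrayBuffer.empty[String] // stands in for the logger

  def registerSource(source: SimpleSource): Unit = {
    sources += source // add to the sources internal registry
    try {
      registry.register(source.sourceName)
    } catch {
      case _: IllegalArgumentException =>
        log += "INFO Metrics already registered"
    }
  }
}
```

Registering the same source name twice leaves the caller unaffected and records a single INFO entry in the log buffer.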

Building Metrics Source Identifier

buildRegistryName(
  source: Source): String

buildRegistryName uses the spark.metrics.namespace and spark.executor.id Spark properties to differentiate between a Spark application's driver and executors, and the other Spark framework's components.

(only when instance is driver or executor) buildRegistryName builds a metrics source name that is made up of spark.metrics.namespace, spark.executor.id and the name of the source.

Note

buildRegistryName uses Dropwizard Metrics' MetricRegistry to build metrics source identifiers.

FIXME Finish for the other components.

buildRegistryName is used when MetricsSystem is requested to register or remove a metrics source.
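The naming rule for the driver and executors can be illustrated with a small, self-contained sketch. The dotted helper imitates how Dropwizard's MetricRegistry.name joins non-empty parts with dots. Passing the namespace and executor ID as Option values, and falling back to the bare source name for every other case, are assumptions made for illustration only.

```scala
object RegistryNameSketch {
  // Joins the non-empty parts with dots (imitating MetricRegistry.name).
  def dotted(parts: String*): String =
    parts.filter(p => p != null && p.nonEmpty).mkString(".")

  def buildRegistryName(
      instance: String,
      namespace: Option[String],  // value of spark.metrics.namespace, if set
      executorId: Option[String], // value of spark.executor.id, if set
      sourceName: String): String = {
    val isAppComponent = instance == "driver" || instance == "executor"
    (namespace, executorId) match {
      case (Some(ns), Some(id)) if isAppComponent =>
        dotted(ns, id, sourceName)
      case _ =>
        // Assumption: all other cases use the source name alone.
        dotted(sourceName)
    }
  }
}
```

Under these assumptions, a source named jvm on the driver with the hypothetical namespace app-1 and executor ID driver is identified as app-1.driver.jvm.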

Registering Metrics Sources for Spark Instance

registerSources(): Unit

registerSources finds the metrics configuration for the instance.

Note

instance is defined when MetricsSystem is created.

registerSources finds the configuration of all the metrics sources for the subsystem (i.e. configuration entries with the source. prefix).

For every metrics source, registerSources finds the class property, creates an instance, and in the end registers it.

When registerSources fails, you should see the following ERROR message in the logs followed by the exception.

Source class [classPath] cannot be instantiated

registerSources is used when MetricsSystem is requested to start.
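The instantiate-and-register loop can be sketched with plain reflection. This is an illustrative model, not Spark's implementation: the per-source configuration is passed in as a Map of java.util.Properties, the registered and errors buffers are hypothetical stand-ins for the sources registry and the logger, and a no-argument constructor is assumed.

```scala
import java.util.Properties
import scala.collection.mutable
import scala.util.control.NonFatal

object RegisterSourcesSketch {
  val registered = mutable.ArrayBuffer.empty[AnyRef] // stands in for registerSource
  val errors = mutable.ArrayBuffer.empty[String]     // stands in for ERROR logging

  def registerSources(sourceConfigs: Map[String, Properties]): Unit =
    sourceConfigs.foreach { case (_, props) =>
      val classPath = props.getProperty("class")
      try {
        // Assumption: the source class has a public no-argument constructor.
        val source = Class.forName(classPath)
          .getConstructor()
          .newInstance()
          .asInstanceOf[AnyRef]
        registered += source
      } catch {
        case NonFatal(_) =>
          errors += s"Source class $classPath cannot be instantiated"
      }
    }
}
```

A failure to load or instantiate one source class is recorded and does not stop the remaining sources from being registered.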

Requesting JSON Servlet Handler

getServletHandlers: Array[ServletContextHandler]

If the MetricsSystem is running and the MetricsServlet is defined for the metrics system, getServletHandlers simply requests the MetricsServlet for the JSON servlet handler.

When MetricsSystem is not running, getServletHandlers throws an IllegalArgumentException.

Can only call getServletHandlers on a running MetricsSystem

getServletHandlers is used when:

  • SparkContext is created
  • (Spark Standalone) Master and Worker are requested to start

Registering Metrics Sinks

registerSinks(): Unit

registerSinks requests the MetricsConfig for the configuration of the instance.

registerSinks requests the MetricsConfig for the configuration of all metrics sinks (i.e. configuration entries that match the ^sink\.(.+)\.(.+) regular expression).

For every metrics sink configuration, registerSinks takes the class property and (if defined) creates an instance of the metrics sink using a constructor that takes the configuration, the MetricRegistry and the SecurityManager.

For a single servlet metrics sink, registerSinks converts the sink to a MetricsServlet and sets the MetricsServlet internal registry.

For all other metrics sinks, registerSinks adds the sink to the sinks internal registry.

In case of an Exception, registerSinks prints out the following ERROR message to the logs:

Sink class [classPath] cannot be instantiated

registerSinks is used when MetricsSystem is requested to start.
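How sink entries are selected and grouped can be sketched as follows. This is a minimal model, not MetricsConfig itself: sinkConfigs and splitServlet are hypothetical helper names, and the instance configuration is assumed to arrive as a java.util.Properties whose keys look like sink.[name].[option].

```scala
import java.util.Properties
import scala.collection.mutable

object SinkConfigSketch {
  type SinkOptions = Map[String, String]

  // Same pattern as in the text: sink.<name>.<option>
  private val SinkEntry = """^sink\.(.+)\.(.+)""".r

  // Groups the matching entries by sink name.
  def sinkConfigs(instanceConfig: Properties): Map[String, SinkOptions] = {
    val grouped = mutable.Map.empty[String, SinkOptions]
    val keys = instanceConfig.stringPropertyNames().iterator()
    while (keys.hasNext) {
      val key = keys.next()
      key match {
        case SinkEntry(sinkName, option) =>
          val options = grouped.getOrElse(sinkName, Map.empty[String, String])
          grouped(sinkName) = options + (option -> instanceConfig.getProperty(key))
        case _ => // not a sink entry, ignored here
      }
    }
    grouped.toMap
  }

  // Separates the single servlet sink from all other sinks.
  def splitServlet(
      configs: Map[String, SinkOptions]): (Option[SinkOptions], Map[String, SinkOptions]) =
    (configs.get("servlet"), configs - "servlet")
}
```

The split mirrors the two registries described above: the servlet entry would back the MetricsServlet, and every other entry would become a member of sinks.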

Stopping

stop(): Unit

stop...FIXME

Reporting Metrics

report(): Unit

report simply requests the registered metrics sinks to report metrics.

Starting

start(): Unit

start turns the running flag on.

Note

start can only be called once and throws an IllegalArgumentException when called multiple times.

start registers the metrics sources for Spark SQL, i.e. CodegenMetrics and HiveCatalogMetrics.

start then registers the configured metrics sources and sinks for the instance.

In the end, start requests the registered metrics sinks to start.

start throws an IllegalArgumentException when the running flag is already on:

requirement failed: Attempting to start a MetricsSystem that is already running
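The "requirement failed:" prefix in the message above is what Scala's require adds to the IllegalArgumentException it throws. A minimal sketch of a running flag guarded this way follows. ToySink and LifecycleSketch are hypothetical names, and the handler type is simplified to String so that the example has no servlet dependency.

```scala
// Hypothetical minimal sink interface for this sketch.
trait ToySink {
  def start(): Unit
  def report(): Unit
}

final class LifecycleSketch(sinks: Seq[ToySink]) {
  private var running = false

  def start(): Unit = {
    require(!running, "Attempting to start a MetricsSystem that is already running")
    running = true
    // Registration of sources and sinks would happen here.
    sinks.foreach(_.start())
  }

  // Asks every registered sink to report.
  def report(): Unit = sinks.foreach(_.report())

  def getServletHandlers: Array[String] = {
    require(running, "Can only call getServletHandlers on a running MetricsSystem")
    Array.empty[String] // simplified: no real handlers in this sketch
  }
}
```

Calling start a second time on the same instance fails the first require, which produces the message shown above.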

Logging

Enable ALL logging level for org.apache.spark.metrics.MetricsSystem logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.metrics.MetricsSystem=ALL

Refer to Logging.

Internal Registries

MetricRegistry

Integration point to Dropwizard Metrics' MetricRegistry

Used when MetricsSystem is requested to register a metrics source.

MetricsConfig

MetricsConfig

Initialized when MetricsSystem is created.

Used when MetricsSystem registers sinks and sources.

MetricsServlet

MetricsServlet JSON metrics sink that is only available for the metrics instances with a web UI (i.e. the driver of a Spark application and Spark Standalone's Master).

MetricsSystem may have at most one MetricsServlet JSON metrics sink (which is registered by default).

Initialized when MetricsSystem registers sinks (and finds a configuration entry with servlet sink name).

Used when MetricsSystem is requested for a JSON servlet handler.

running Flag

Indicates whether MetricsSystem has been started (true) or not (false)

Default: false

sinks

Metrics sinks

Used when MetricsSystem registers metrics sinks, starts and reports metrics.

sources

Metrics sources

Used when MetricsSystem registers a metrics source.


Last update: 2020-10-08