Skip to content

MetricsSystem

MetricsSystem is a registry of metrics sources and sinks of a Spark subsystem.

Creating Instance

MetricsSystem takes the following to be created:

While being created, MetricsSystem requests the MetricsConfig to initialize.

Creating MetricsSystem

MetricsSystem is created (using createMetricsSystem utility) for the Metrics Systems.

PrometheusServlet

MetricsSystem creates a PrometheusServlet when requested to registerSinks for an instance with sink.prometheusServlet configuration.

MetricsSystem requests the PrometheusServlet for URL handlers when requested for servlet handlers (so it can be attached to a web UI and serve HTTP requests).

MetricsServlet

Note

review me

MetricsServlet JSON metrics sink that is only available for the <> with a web UI (i.e. the driver of a Spark application and Spark Standalone's Master).

MetricsSystem may have at most one MetricsServlet JSON metrics sink (which is registered by default).

Initialized when MetricsSystem registers <> (and finds a configuration entry with servlet sink name).

Used when MetricsSystem is requested for a <>.

Creating MetricsSystem

createMetricsSystem(
  instance: String
  conf: SparkConf
  securityMgr: SecurityManager): MetricsSystem

createMetricsSystem creates a new MetricsSystem (for the given parameters).

createMetricsSystem is used to create metrics systems.

Metrics Sources for Spark SQL

  • CodegenMetrics
  • HiveCatalogMetrics

Registering Metrics Source

registerSource(
  source: Source): Unit

registerSource adds source to the sources internal registry.

registerSource creates an identifier for the metrics source and registers it with the MetricRegistry.

registerSource registers the metrics source under a given name.

registerSource prints out the following INFO message to the logs when registering a name more than once:

Metrics already registered

Building Metrics Source Identifier

buildRegistryName(
  source: Source): String

buildRegistryName uses spark-metrics-properties.md#spark.metrics.namespace[spark.metrics.namespace] and executor:Executor.md#spark.executor.id[spark.executor.id] Spark properties to differentiate between a Spark application's driver and executors, and the other Spark framework's components.

(only when <> is driver or executor) buildRegistryName builds metrics source name that is made up of spark-metrics-properties.md#spark.metrics.namespace[spark.metrics.namespace], executor:Executor.md#spark.executor.id[spark.executor.id] and the name of the source.

FIXME Finish for the other components.

buildRegistryName is used when MetricsSystem is requested to register or remove a metrics source.

Registering Metrics Sources for Spark Instance

registerSources(): Unit

registerSources finds <> configuration for the <>.

NOTE: instance is defined when MetricsSystem <>.

registerSources finds the configuration of all the spark-metrics-Source.md[metrics sources] for the subsystem (as described with source. prefix).

For every metrics source, registerSources finds class property, creates an instance, and in the end <>.

When registerSources fails, you should see the following ERROR message in the logs followed by the exception.

Source class [classPath] cannot be instantiated

registerSources is used when MetricsSystem is requested to start.

Servlet Handlers

getServletHandlers: Array[ServletContextHandler]

getServletHandlers requests the metricsServlet (if defined) and the prometheusServlet (if defined) for URL handlers.

getServletHandlers requires that the MetricsSystem is running or throws an IllegalArgumentException:

Can only call getServletHandlers on a running MetricsSystem

getServletHandlers is used when:

Registering Metrics Sinks

registerSinks(): Unit

registerSinks requests the <> for the spark-metrics-MetricsConfig.md#getInstance[configuration] of the <>.

registerSinks requests the <> for the spark-metrics-MetricsConfig.md#subProperties[configuration] of all metrics sinks (i.e. configuration entries that match ^sink\\.(.+)\\.(.+) regular expression).

For every metrics sink configuration, registerSinks takes class property and (if defined) creates an instance of the metric sink using an constructor that takes the configuration, <> and <>.

For a single servlet metrics sink, registerSinks converts the sink to a spark-metrics-MetricsServlet.md[MetricsServlet] and sets the <> internal registry.

For all other metrics sinks, registerSinks adds the sink to the <> internal registry.

In case of an Exception, registerSinks prints out the following ERROR message to the logs:

Sink class [classPath] cannot be instantiated

registerSinks is used when MetricsSystem is requested to start.

Stopping

stop(): Unit

stop...FIXME

Reporting Metrics

report(): Unit

report simply requests the registered metrics sinks to report metrics.

Starting

start(): Unit

start turns <> flag on.

NOTE: start can only be called once and <> an IllegalArgumentException when called multiple times.

start <> the <> for Spark SQL, i.e. CodegenMetrics and HiveCatalogMetrics.

start then registers the configured metrics <> and <> for the <>.

In the end, start requests the registered <> to spark-metrics-Sink.md#start[start].

[[start-IllegalArgumentException]] start throws an IllegalArgumentException when <> flag is on.

requirement failed: Attempting to start a MetricsSystem that is already running

Logging

Enable ALL logging level for org.apache.spark.metrics.MetricsSystem logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.metrics.MetricsSystem=ALL

Refer to Logging.

Internal Registries

MetricRegistry

Integration point to Dropwizard Metrics' MetricRegistry

Used when MetricsSystem is requested to:

MetricsConfig

MetricsConfig

Initialized when MetricsSystem is <>.

Used when MetricsSystem registers <> and <>.

running Flag

Indicates whether MetricsSystem has been started (true) or not (false)

Default: false

sinks

Metrics sinks

Used when MetricsSystem <> and <>.

sources

Metrics sources

Used when MetricsSystem <>.