Observation¶
Observation is used to simplify observing named metrics in batch queries using Dataset.observe.
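import org.apache.spark.sql.Observation
import org.apache.spark.sql.functions.max
// Observe the highest id (max_id) in the Dataset while counting its rows
val observation = Observation("name")
val observed = ds.observe(observation, max($"id").as("max_id"))
observed.count()
val metrics = observation.get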
// Observe row count (rows) and highest id (maxid) in the Dataset while writing it
val observation = Observation("my_metrics")
val observed_ds = ds.observe(observation, count(lit(1)).as("rows"), max($"id").as("maxid"))
observed_ds.write.parquet("ds.parquet")
val metrics = observation.get
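The snippets above assume an existing `ds` and a spark-shell session. A self-contained sketch of the same pattern could look as follows. The local `SparkSession` setup, the `spark.range(10)` input, and the use of `collect()` as the triggering action are assumptions made for illustration only.

```scala
import org.apache.spark.sql.{Observation, SparkSession}
import org.apache.spark.sql.functions.{col, count, lit, max}

// Local session for illustration only (assumed setup, not from the source)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("observation-demo")
  .getOrCreate()

// Hypothetical input: a Dataset with a single `id` column holding 0 to 9
val ds = spark.range(10)

val observation = Observation("my_metrics")
val observed = ds.observe(
  observation,
  count(lit(1)).as("rows"),
  max(col("id")).as("maxid"))

// The metrics are collected while an action runs on the observed Dataset
observed.collect()

// Metrics are returned as a map keyed by the aliases given above
val metrics = observation.get
println(metrics("rows"))
println(metrics("maxid"))

// Skip this line in spark-shell, where the session is managed for you
spark.stop()
```

The metrics are read from the Observation instance itself, so the code does not need to register a listener to receive them.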
Observation was added in 3.3.0 by [SPARK-34806][SQL] Add Observation helper for Dataset.observe.
Creating Instance¶
Observation takes the following to be created:
- Name (default: random UUID)
Observation is created using the apply factories.
Creating Observation¶
apply(): Observation
apply(name: String): Observation
apply creates an Observation.
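As a minimal sketch of the two factories, the following creates one Observation with an explicit name and one without, then reads a metric through the unnamed instance. The local `SparkSession`, the `spark.range(5, 10)` input, and the `min_id` alias are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.{Observation, SparkSession}
import org.apache.spark.sql.functions.{col, min}

// Local session for illustration only (assumed setup, not from the source)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("observation-factories")
  .getOrCreate()

// apply(name: String): the caller picks the name
val named = Observation("my_metrics")

// apply(): the name defaults to a random UUID
val unnamed = Observation()

// Hypothetical input: ids 5 to 9
val observed = spark.range(5, 10).observe(unnamed, min(col("id")).as("min_id"))
observed.collect()

// The metric is looked up by its alias, not by the Observation name
val minId = unnamed.get.apply("min_id")
println(minId)

// Skip this line in spark-shell, where the session is managed for you
spark.stop()
```

Since the metrics are fetched from the instance, the no-argument factory is enough when the name itself is of no interest to the caller.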