
RocksDB State Store

RocksDB can be used as a state store backend in Spark Structured Streaming.

RocksDB's Notable Features

RocksDB is an embeddable persistent key-value store with the following features:

  • Uses a log-structured database engine
  • Keys and values are arbitrarily-sized byte streams
  • Optimized for fast, low-latency storage (flash drives and high-speed disk drives) to support high read/write rates

The full documentation is currently on the GitHub wiki.

stateStore.providerClass

Setting the spark.sql.streaming.stateStore.providerClass configuration property to org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider enables RocksDBStateStoreProvider as the default StateStoreProvider.
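As a minimal sketch of how the property could be set, the following Scala application configures it while building a SparkSession. The object name, application name and local master URL are placeholders chosen for illustration, and the example assumes a Spark version that ships RocksDBStateStoreProvider (3.2 or later).

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical demo object; the name is for illustration only.
object RocksDBStateStoreDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rocksdb-state-store-demo") // placeholder application name
      .master("local[*]")                  // assumption: local run for experimentation
      // Make RocksDBStateStoreProvider the default StateStoreProvider
      .config(
        "spark.sql.streaming.stateStore.providerClass",
        "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider")
      .getOrCreate()

    // Read the property back to check which provider class is configured
    println(spark.conf.get("spark.sql.streaming.stateStore.providerClass"))

    spark.stop()
  }
}
```

The same property can likely be passed on the command line instead, for example with spark-submit's --conf option, which keeps the choice of state store out of application code.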

Logging

RocksDB is used to create a native logger and configure a logging level accordingly.

Demo