Skip to content

RocksDBConf

RocksDBConf is the configuration options for optimizing RocksDB.

Configuration Options

The configuration options belong to spark.sql.streaming.stateStore.rocksdb namespace.

blockCacheSizeMB

spark.sql.streaming.stateStore.rocksdb.blockCacheSizeMB

Default: 8

blockSizeKB

spark.sql.streaming.stateStore.rocksdb.blockSizeKB

Block size (in kB) (that RocksDB sets on a BlockBasedTableConfig)

Default: 4

compactOnCommit

spark.sql.streaming.stateStore.rocksdb.compactOnCommit

Whether to compact RocksDB data while committing state changes (before checkpointing)

Default: false

formatVersion

spark.sql.streaming.stateStore.rocksdb.formatVersion

Default: 5

lockAcquireTimeoutMs

spark.sql.streaming.stateStore.rocksdb.lockAcquireTimeoutMs

Default: 60000

minVersionsToRetain

minVersionsToRetain

resetStatsOnLoad

spark.sql.streaming.stateStore.rocksdb.resetStatsOnLoad

Default: true

trackTotalNumberOfRows

spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows

Enables tracking the total number of rows

Default: true

Adds additional lookups on write operations (put and remove) to track the changes of total number of rows, which can help observability on state store.

Possible Performance Degradation

The additional lookups bring non-trivial overhead on write-heavy workloads. If your query does lots of writes on state, it would be encouraged to turn off the config and turn on again when you really need the know the number for observability/debuggability.

Used when:

Creating RocksDBConf

apply(): RocksDBConf // (1)!
apply(
  storeConf: StateStoreConf): RocksDBConf
  1. Used in tests only

apply requests the given StateStoreConf for the state store configuration options and creates a RocksDBConf.


apply is used when:

  • RocksDBStateStoreProvider is requested for the RocksDB