RocksDBConf¶
RocksDBConf holds the configuration options for tuning RocksDB.
Configuration Options¶
The configuration options belong to the spark.sql.streaming.stateStore.rocksdb namespace.
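As a minimal sketch of how the namespace combines with the option names on this page, the following plain Scala builds fully-qualified keys. The `RocksDBOptionKeys` object and its `fullKey` helper are hypothetical names introduced for illustration only, and the override values are arbitrary examples, not recommendations.

```scala
object RocksDBOptionKeys {
  // Namespace shared by all the options described on this page
  val Prefix = "spark.sql.streaming.stateStore.rocksdb"

  // Hypothetical helper: turns a short option name into its full key
  def fullKey(shortName: String): String = s"$Prefix.$shortName"

  // Illustrative overrides (values chosen only for the example)
  val overrides: Map[String, String] = Map(
    fullKey("compactOnCommit") -> "true",
    fullKey("blockCacheSizeMB") -> "64"
  )

  def main(args: Array[String]): Unit =
    // Prints one key=value pair per line
    overrides.foreach { case (key, value) => println(s"$key=$value") }
}
```

Key-value pairs like these would typically be supplied as regular Spark configuration properties before the streaming query starts.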
blockCacheSizeMB¶
spark.sql.streaming.stateStore.rocksdb.blockCacheSizeMB
Default: 8
blockSizeKB¶
spark.sql.streaming.stateStore.rocksdb.blockSizeKB
Block size (in kB) that RocksDB sets on a BlockBasedTableConfig
Default: 4
compactOnCommit¶
spark.sql.streaming.stateStore.rocksdb.compactOnCommit
Whether to compact RocksDB data while committing state changes (before checkpointing)
Default: false
formatVersion¶
spark.sql.streaming.stateStore.rocksdb.formatVersion
Default: 5
lockAcquireTimeoutMs¶
spark.sql.streaming.stateStore.rocksdb.lockAcquireTimeoutMs
Default: 60000
minVersionsToRetain¶
resetStatsOnLoad¶
spark.sql.streaming.stateStore.rocksdb.resetStatsOnLoad
Default: true
trackTotalNumberOfRows¶
spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows
Enables tracking the total number of rows
Default: true
Adds extra lookups on write operations (put and remove) to track changes to the total number of rows, which helps with state store observability.
Possible Performance Degradation
The additional lookups add non-trivial overhead on write-heavy workloads. If your query writes to state frequently, consider turning this option off and turning it back on only when you need the row count for observability or debugging.
Used when:
RocksDB is requested to load a version, put a new value for a key, and remove a key
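To illustrate why trackTotalNumberOfRows costs an extra lookup per write, here is a simplified in-memory model. It is not RocksDB's actual implementation: `CountingStore`, its `lookups` counter, and the `Option`-based `numRows` are assumptions made for this sketch. The point is only that a put has to check whether the key already exists before it can tell if the row count changes, and a remove has to check whether there was anything to delete.

```scala
import scala.collection.mutable

// Simplified model of a key-value store with optional row counting
// (hypothetical; not the real RocksDB state store code)
final class CountingStore(trackTotalNumberOfRows: Boolean) {
  private val data = mutable.Map.empty[String, String]
  private var rows = 0L

  // Number of extra reads spent purely on row tracking
  var lookups = 0L

  def put(key: String, value: String): Unit = {
    if (trackTotalNumberOfRows) {
      lookups += 1
      // Extra read: only a previously unseen key increases the row count
      if (!data.contains(key)) rows += 1
    }
    data.put(key, value)
  }

  def remove(key: String): Unit = {
    if (trackTotalNumberOfRows) {
      lookups += 1
      // Extra read: only an existing key decreases the row count
      if (data.contains(key)) rows -= 1
    }
    data.remove(key)
  }

  // Some(count) when tracking is enabled, None otherwise
  def numRows: Option[Long] =
    if (trackTotalNumberOfRows) Some(rows) else None
}
```

With tracking disabled, writes in this model skip the existence check entirely, which mirrors the trade-off described above: fewer reads per write, but no row count to report.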
Creating RocksDBConf¶
apply(): RocksDBConf // Used in tests only
apply(
  storeConf: StateStoreConf): RocksDBConf
apply requests the given StateStoreConf for the state store configuration options and creates a RocksDBConf.
apply is used when:
RocksDBStateStoreProvider is requested for the RocksDB
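The way apply resolves options can be sketched in plain Scala as follows. This is a hedged illustration, not Spark's source: `RocksDBConfSketch` is a hypothetical stand-in for RocksDBConf, a `Map[String, String]` stands in for the options carried by StateStoreConf, and the field types are assumptions. The defaults are the ones listed on this page. minVersionsToRetain is left out because this page gives no details for it.

```scala
// Hypothetical stand-in for RocksDBConf (field types are assumptions)
final case class RocksDBConfSketch(
    blockCacheSizeMB: Long,
    blockSizeKB: Long,
    compactOnCommit: Boolean,
    formatVersion: Int,
    lockAcquireTimeoutMs: Long,
    resetStatsOnLoad: Boolean,
    trackTotalNumberOfRows: Boolean)

object RocksDBConfSketch {
  private val Prefix = "spark.sql.streaming.stateStore.rocksdb"

  // `confs` plays the role of the options held by StateStoreConf
  def apply(confs: Map[String, String]): RocksDBConfSketch = {
    // Look up a namespaced option, falling back to the documented default
    def get(name: String, default: String): String =
      confs.getOrElse(s"$Prefix.$name", default)

    new RocksDBConfSketch(
      blockCacheSizeMB = get("blockCacheSizeMB", "8").toLong,
      blockSizeKB = get("blockSizeKB", "4").toLong,
      compactOnCommit = get("compactOnCommit", "false").toBoolean,
      formatVersion = get("formatVersion", "5").toInt,
      lockAcquireTimeoutMs = get("lockAcquireTimeoutMs", "60000").toLong,
      resetStatsOnLoad = get("resetStatsOnLoad", "true").toBoolean,
      trackTotalNumberOfRows = get("trackTotalNumberOfRows", "true").toBoolean)
  }

  // No-argument variant that uses the defaults only
  def apply(): RocksDBConfSketch = apply(Map.empty[String, String])
}
```

In this sketch, `RocksDBConfSketch()` yields the defaults, while passing a map with, for example, the compactOnCommit key set to "true" overrides just that option.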