RocksDBConf¶
RocksDBConf holds the configuration options for optimizing RocksDB.
Configuration Options¶
The configuration options belong to the spark.sql.streaming.stateStore.rocksdb namespace.
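As a minimal sketch of how options in this namespace could be set, the following builds a local SparkSession with two of the options listed on this page. The providerClass key and the RocksDBStateStoreProvider class name do not appear on this page and are assumed from the Spark documentation; the application name and the chosen values are purely illustrative.

```scala
import org.apache.spark.sql.SparkSession

object RocksDBConfExample {
  def main(args: Array[String]): Unit = {
    // Namespace shared by all the options on this page
    val ns = "spark.sql.streaming.stateStore.rocksdb"

    val spark = SparkSession.builder()
      .appName("rocksdb-conf-example") // illustrative name
      .master("local[*]")
      // Assumption: selects the RocksDB-backed state store provider
      .config(
        "spark.sql.streaming.stateStore.providerClass",
        "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider")
      // Options from this page, given as fully-qualified keys
      .config(s"$ns.compactOnCommit", "true")
      .config(s"$ns.blockSizeKB", "4")
      .getOrCreate()

    // Read a value back to confirm it is set on the session
    println(spark.conf.get(s"$ns.compactOnCommit"))

    spark.stop()
  }
}
```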
blockCacheSizeMB¶
spark.sql.streaming.stateStore.rocksdb.blockCacheSizeMB
Default: 8
blockSizeKB¶
spark.sql.streaming.stateStore.rocksdb.blockSizeKB
Block size (in KB) that RocksDB sets on a BlockBasedTableConfig
Default: 4
compactOnCommit¶
spark.sql.streaming.stateStore.rocksdb.compactOnCommit
Whether to compact RocksDB data while committing state changes (before checkpointing)
Default: false
formatVersion¶
spark.sql.streaming.stateStore.rocksdb.formatVersion
Default: 5
lockAcquireTimeoutMs¶
spark.sql.streaming.stateStore.rocksdb.lockAcquireTimeoutMs
Default: 60000
minVersionsToRetain¶
resetStatsOnLoad¶
spark.sql.streaming.stateStore.rocksdb.resetStatsOnLoad
Default: true
trackTotalNumberOfRows¶
spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows
Enables tracking the total number of rows
Default: true
Adds extra lookups on write operations (put and remove) to track changes in the total number of rows, which improves observability of the state store.
Possible Performance Degradation
The additional lookups add non-trivial overhead on write-heavy workloads. If your query writes to state frequently, consider turning this option off and turning it back on only when you need the row count for observability or debugging.
Used when:
RocksDB is requested to load a version, put a new value for a key, and remove a key
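To illustrate where the extra lookups come from, here is a toy in-memory model of a store that tracks its row count. It is not Spark's RocksDB class: ToyStore, its lookups counter, and the -1 "unknown" value are hypothetical names and choices made for this sketch only.

```scala
import scala.collection.mutable

// Toy in-memory stand-in for a state store; NOT Spark's RocksDB class.
final class ToyStore(trackTotalNumberOfRows: Boolean) {
  private val data = mutable.Map.empty[String, String]
  private var rows = 0L
  var lookups = 0L // counts the extra reads caused by tracking

  def put(key: String, value: String): Unit = {
    if (trackTotalNumberOfRows) {
      lookups += 1                       // extra read before the write
      if (!data.contains(key)) rows += 1 // only new keys grow the count
    }
    data.update(key, value)
  }

  def remove(key: String): Unit = {
    if (trackTotalNumberOfRows) {
      lookups += 1                       // extra read before the delete
      if (data.contains(key)) rows -= 1  // only existing keys shrink it
    }
    data.remove(key)
  }

  // -1 means "unknown" in this toy model when tracking is off
  def numRows: Long = if (trackTotalNumberOfRows) rows else -1L
}
```

With tracking on, every put and remove pays one extra read; with tracking off, writes go straight through and the row count is simply not available.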
Creating RocksDBConf¶
apply(): RocksDBConf // Used in tests only
apply(
  storeConf: StateStoreConf): RocksDBConf
apply requests the given StateStoreConf for the state store configuration options and creates a RocksDBConf.
apply is used when:
RocksDBStateStoreProvider is requested for the RocksDB
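The idea of building a configuration object from raw options with defaults can be sketched as follows. ToyRocksDBConf and fromOptions are hypothetical names and do not reflect Spark's actual implementation; the defaults are the ones listed on this page, and minVersionsToRetain is left out because this page gives no details for it.

```scala
// Hypothetical stand-in for RocksDBConf; field names mirror the options above.
final case class ToyRocksDBConf(
    blockCacheSizeMB: Long,
    blockSizeKB: Long,
    compactOnCommit: Boolean,
    formatVersion: Int,
    lockAcquireTimeoutMs: Long,
    resetStatsOnLoad: Boolean,
    trackTotalNumberOfRows: Boolean)

object ToyRocksDBConf {
  val Prefix = "spark.sql.streaming.stateStore.rocksdb"

  // Builds a conf from raw key/value options, falling back to the
  // defaults documented on this page when a key is absent.
  def fromOptions(options: Map[String, String]): ToyRocksDBConf = {
    def get(name: String, default: String): String =
      options.getOrElse(s"$Prefix.$name", default)

    ToyRocksDBConf(
      blockCacheSizeMB = get("blockCacheSizeMB", "8").toLong,
      blockSizeKB = get("blockSizeKB", "4").toLong,
      compactOnCommit = get("compactOnCommit", "false").toBoolean,
      formatVersion = get("formatVersion", "5").toInt,
      lockAcquireTimeoutMs = get("lockAcquireTimeoutMs", "60000").toLong,
      resetStatsOnLoad = get("resetStatsOnLoad", "true").toBoolean,
      trackTotalNumberOfRows = get("trackTotalNumberOfRows", "true").toBoolean)
  }
}
```

Calling ToyRocksDBConf.fromOptions(Map.empty) yields a conf with every documented default; passing a map with a fully-qualified key overrides just that option.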