EmbeddedRocksDB Engine
This engine allows integrating ClickHouse with RocksDB.
Creating a Table
Engine parameters:
- ttl - time to live for values, in seconds. If ttl is 0, a regular RocksDB instance is used (without TTL).
- rocksdb_dir - path to the directory of an existing RocksDB, or the destination path for a newly created RocksDB. The table is opened with the specified rocksdb_dir.
- read_only - when read_only is set to true, read-only mode is used. For storage with TTL, compaction is not triggered (neither manual nor automatic), so no expired entries are removed.
- primary_key_name - any column name in the column list.
- A primary key must be specified, and it supports only one column. The primary key is serialized in binary as a rocksdb key.
- Columns other than the primary key are serialized in binary as a rocksdb value, in corresponding order.
- Queries with key equals or in filtering are optimized to a multi-key lookup from rocksdb.
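As a rough illustration of the last point, the two filter shapes that can be served by key lookups might look like the sketch below. The table name test and its columns (key String as the primary key, plus v1, v2, v3) are hypothetical and chosen only for illustration.

```sql
-- Assumes a hypothetical table test(key String, v1 UInt32, v2 String, v3 Float32)
-- created with ENGINE = EmbeddedRocksDB and PRIMARY KEY key.

-- Equality filter on the primary key: a single-key lookup.
SELECT * FROM test WHERE key = 'some key';

-- IN filter on the primary key: a multi-key lookup.
SELECT * FROM test WHERE key IN ('some key', 'another key');
```

Filters on non-key columns (for example v1 > 1) cannot use this lookup path and have to scan the stored data.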
Engine settings:
- optimize_for_bulk_insert - the table is optimized for bulk insertions (the insert pipeline creates SST files and imports them into the rocksdb database instead of writing to memtables); default value: 1.
- bulk_insert_block_size - minimum size of SST files (in terms of rows) created by bulk insertion; default value: 1048449.
Example:
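A minimal sketch of table creation, assuming hypothetical table and column names (test, test_ttl, key, v1, v2, v3). The second statement passes only the ttl parameter and sets one engine setting explicitly; the values 3600 and 1 are illustrative, not recommendations.

```sql
-- Basic table: key is the single-column primary key,
-- the remaining columns are stored together as the RocksDB value.
CREATE TABLE test
(
    key String,
    v1 UInt32,
    v2 String,
    v3 Float32
)
ENGINE = EmbeddedRocksDB
PRIMARY KEY key;

-- Variant with a TTL of 3600 seconds and an explicit engine setting.
CREATE TABLE test_ttl
(
    key String,
    v1 UInt32
)
ENGINE = EmbeddedRocksDB(3600)
PRIMARY KEY key
SETTINGS optimize_for_bulk_insert = 1;
```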
Metrics
There is also a system.rocksdb table that exposes rocksdb statistics:
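A query along these lines should list the available statistics. The actual metric names and values depend on the RocksDB version and the workload, so no output is shown here.

```sql
-- List RocksDB statistics as name/value pairs.
SELECT
    name,
    value
FROM system.rocksdb;
```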
Configuration
You can also change any rocksdb options using config:
By default, the trivial approximate count optimization is turned off, which might affect the performance of count() queries. To enable this optimization, set optimize_trivial_approximate_count_query = 1. This setting also affects system.tables for the EmbeddedRocksDB engine: turn it on to see approximate values for total_rows and total_bytes.
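A short sketch of enabling the optimization at session level, assuming a hypothetical EmbeddedRocksDB table named test in the current database:

```sql
-- Enable the approximate count optimization for this session.
SET optimize_trivial_approximate_count_query = 1;

-- count() may now be answered from RocksDB's approximate key count.
SELECT count() FROM test;

-- With the setting on, system.tables reports approximate sizes for the table.
SELECT name, total_rows, total_bytes
FROM system.tables
WHERE database = currentDatabase() AND name = 'test';
```

Because the count is approximate, it should not be relied on where an exact row count is required.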
Supported operations
Inserts
When new rows are inserted into EmbeddedRocksDB, if the key already exists, the value is updated; otherwise a new key is created.
Example:
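The sketch below inserts the same key twice to illustrate the update-on-insert behavior. It assumes a hypothetical table test(key String, v1 UInt32, v2 String, v3 Float32) with key as the primary key.

```sql
-- First insert creates the key.
INSERT INTO test VALUES ('some key', 1, 'value', 3.2);

-- Second insert with the same key replaces the stored value
-- instead of adding a second row.
INSERT INTO test VALUES ('some key', 2, 'new value', 6.4);

-- Expected to return a single row for 'some key'.
SELECT * FROM test WHERE key = 'some key';
```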
Deletes
Rows can be deleted using a DELETE query or TRUNCATE.
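For illustration, assuming a hypothetical table test(key String, v1 UInt32, v2 String, v3 Float32), the deletion forms might be written as:

```sql
-- Delete rows matching a condition.
DELETE FROM test WHERE key LIKE 'some%' AND v1 > 1;

-- The same deletion expressed as an ALTER TABLE mutation.
ALTER TABLE test DELETE WHERE key LIKE 'some%' AND v1 > 1;

-- Remove all rows from the table.
TRUNCATE TABLE test;
```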
Updates
Values can be updated using the ALTER TABLE query. The primary key cannot be updated.
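A sketch of an update on a non-key column, again assuming a hypothetical table test(key String, v1 UInt32, v2 String, v3 Float32) with key as the primary key:

```sql
-- Update the value column v1 for matching rows.
-- The primary key column (key) cannot appear on the left side of the assignment.
ALTER TABLE test UPDATE v1 = v1 * 10 + 2 WHERE key LIKE 'some%' AND v3 > 3.1;
```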
Joins
A special direct join with EmbeddedRocksDB tables is supported. This direct join avoids forming a hash table in memory and accesses the data directly from the EmbeddedRocksDB table. With large joins you may see much lower memory usage with direct joins because the hash table is not created.
To enable direct joins:
When join_algorithm is set to 'direct, hash', direct joins are used when possible, and hash joins otherwise.
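A minimal sketch of a join that can use the direct algorithm. It assumes a hypothetical EmbeddedRocksDB table test(key String, v1 UInt32, v2 String, v3 Float32) on the right-hand side, joined on its primary key; the subquery on the left is illustrative only.

```sql
-- Prefer direct joins, fall back to hash joins when direct is not applicable.
SET join_algorithm = 'direct, hash';

-- The EmbeddedRocksDB table is on the right side and is joined
-- on its primary key, so rows can be fetched by key lookup.
SELECT t.key, r.v1, r.v2
FROM (SELECT 'some key' AS key) AS t
INNER JOIN test AS r ON t.key = r.key;
```

If the join condition does not use the primary key of the right-hand table, the query is expected to fall back to the hash algorithm under this setting.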