ignite-issues mailing list archives

From "Ivan Bessonov (Jira)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-14747) RocksDB research: configuration, lifecycle, basic integration
Date Thu, 03 Jun 2021 15:01:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17356500#comment-17356500 ]

Ivan Bessonov commented on IGNITE-14747:
----------------------------------------

Some research results:
 * RocksDB is pretty easy to integrate. Among other features, it lets you store arbitrary data
in sorted order, iterate over it, and take snapshots of the state that stay valid until the next restart.
 * Every DB instance can have multiple "column families". I view them as partitions and possibly
as "index.bin" candidates. Batch writes spanning multiple column families are supported, which is
good for SQL indexes. There's also "dropColumnFamilies" to evict multiple partitions at once.

 ** Here we have a potential issue: evicted partitions will still be present in the LSM tree
until it is fully compacted. That will take some time, so at times we will store more data than
necessary, on top of the duplicated entries already present in the LSM tree.
 * Every instance has its own WAL. We should consider disabling it, because it will be replaced
with rebalancing from the RAFT log.
 * For the first implementation we could create a new RocksDB instance for every table.
 ** Cons: memory consumption is hard to configure. As far as I know, we can't force several RocksDB
instances to share memory restrictions.
 ** Pros: better read performance. Every cache tree is separate and hence much smaller, giving
you fewer lookups in general.
 * Using RocksDB for the RAFT log. From what I understand, the log is basically a cache
"long -> value" with an auto-incrementing key and extremely rare update operations, so it is almost
append-only. This approach may not be optimal for a very simple reason: merging layer files is
effectively equal to concatenation, but there's no way to tell the engine that. This will lead to
excessive IO when we don't need it.
 * Lifecycle: not much to say here. We should start RocksDB before starting caches and stop it
after stopping caches. There should be an explicit way to pass the partition number to the API or
something similar; these details will be decided later.
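
A side note on the "long -> value" log idea above: RocksDB's default comparator orders keys
bytewise, so an auto-incrementing long key has to be encoded big-endian for iteration to follow
the numeric order. Below is a minimal sketch of such an encoding using only the JDK. The class
and method names (LogKeyCodec, encode, decode, compareBytewise) are hypothetical and not part of
Ignite or RocksDB, and it assumes log indices are non-negative.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not an Ignite or RocksDB class.
final class LogKeyCodec {
    private LogKeyCodec() {
    }

    // Encodes a non-negative log index as 8 big-endian bytes
    // (ByteBuffer uses big-endian byte order by default).
    static byte[] encode(long index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be non-negative: " + index);
        }
        return ByteBuffer.allocate(Long.BYTES).putLong(index).array();
    }

    // Decodes a key produced by encode().
    static long decode(byte[] key) {
        return ByteBuffer.wrap(key).getLong();
    }

    // Models a bytewise (unsigned, lexicographic) key comparison,
    // which is how a bytewise comparator would order the encoded keys.
    static int compareBytewise(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }
}
```

With this encoding, compareBytewise(encode(a), encode(b)) has the same sign as Long.compare(a, b)
for non-negative a and b, so a forward iterator over the keys would visit log entries in index order.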

> RocksDB research: configuration, lifecycle, basic integration
> -------------------------------------------------------------
>
>                 Key: IGNITE-14747
>                 URL: https://issues.apache.org/jira/browse/IGNITE-14747
>             Project: Ignite
>          Issue Type: New Feature
>            Reporter: Sergey Chugunov
>            Assignee: Ivan Bessonov
>            Priority: Major
>              Labels: iep-74, ignite-3
>             Fix For: 3.0
>
>
> In accordance with [IEP-74|https://cwiki.apache.org/confluence/display/IGNITE/IEP-74+Data+Storage]
first implementation of persistent Storage will be based on RocksDB K-V storage.
> Thus research is needed on how to integrate it into ignite-3 realm. The following questions
should be covered:
> # What additional configuration properties are needed.
> # How to reconcile lifecycle of RocksDB instance with Ignite node lifecycle.
> # How RocksDB abstractions (e.g. partitions) match with Ignite abstractions.
> Also scope of tasks to implement basic Storage API over RocksDB should be defined.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
