ignite-dev mailing list archives

From Вячеслав Коптилин <slava.kopti...@gmail.com>
Subject Re: Read Repair (ex. Consistency Check) - review request #2
Date Fri, 28 Jun 2019 11:01:02 GMT
Hi Anton,

I will take a look at your pull request if you don't mind.

In any case, could you please update the IEP page with the list of
constraints/limitations of the proposed approach, TODOs, etc.?
For instance, I would like to see all these limitations on the IEP page as
JIRA tickets. Perhaps it would be good to create an epic/umbrella ticket
to track all activities related to the `Read Repair` feature.

Thanks,
S.

On Thu, 20 Jun 2019 at 14:15, Anton Vinogradov <av@apache.org> wrote:

> Igniters,
> I'm glad to introduce the Read Repair feature [0], which provides an
> additional consistency guarantee for Ignite.
>
> 1) Why do we need it?
> The detailed explanation can be found in IEP-31 [1].
> In short, because of bugs, it's possible to end up in an inconsistent state.
> We need additional features to handle this case.
>
> Currently we are able to check the cluster using the idle_verify [2]
> feature, but it will not fix the data and will not even tell which entries
> are broken.
> Read Repair is a feature to find out which entries are broken and to fix
> them.
>
> 2) How does it work?
> IgniteCache is now able to provide a special proxy [3], withReadRepair().
> This proxy guarantees that data will be fetched from all owners and compared.
> In the case of a consistency violation, the data will be recovered and
> a special event recorded.
>
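
To make the flow in 2) concrete, here is a toy model of a read that fetches a key from all owners, compares the copies, repairs mismatches and records the violation. It is only a sketch under assumptions, not the code from the pull request: the class ReadRepairSketch, the Versioned type, the per-owner maps and the "highest version wins" repair policy are all hypothetical. With the real API the call would presumably look like cache.withReadRepair().get(key), going by the proxy in [3].

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a read-repair read. NOT Ignite's implementation. */
class ReadRepairSketch {
    /** A value plus the version it was written with (hypothetical type). */
    static final class Versioned {
        final String value;
        final long version;

        Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    /** One in-memory map per partition owner (stand-in for primary and backups). */
    final List<Map<String, Versioned>> owners = new ArrayList<>();

    /** Keys with a detected violation (stand-in for the "special event"). */
    final List<String> violations = new ArrayList<>();

    ReadRepairSketch(int ownerCount) {
        for (int i = 0; i < ownerCount; i++)
            owners.add(new HashMap<>());
    }

    /** Reads the key from every owner, compares the copies, repairs on mismatch. */
    String getWithReadRepair(String key) {
        // Assumption: the copy with the highest version is the correct one.
        Versioned latest = null;
        for (Map<String, Versioned> owner : owners) {
            Versioned copy = owner.get(key);
            if (copy != null && (latest == null || copy.version > latest.version))
                latest = copy;
        }

        if (latest == null)
            return null; // No owner has the key, so there is nothing to compare.

        boolean mismatch = false;
        for (Map<String, Versioned> owner : owners) {
            Versioned copy = owner.get(key);
            if (copy == null || copy.version != latest.version) {
                mismatch = true;
                owner.put(key, latest); // Repair the stale or missing copy.
            }
        }

        if (mismatch)
            violations.add(key); // Record the consistency violation once per read.

        return latest.value;
    }

    public static void main(String[] args) {
        ReadRepairSketch cache = new ReadRepairSketch(3);
        cache.owners.get(0).put("k", new Versioned("new", 2));
        cache.owners.get(1).put("k", new Versioned("old", 1)); // stale backup
        cache.owners.get(2).put("k", new Versioned("new", 2));

        System.out.println(cache.getWithReadRepair("k"));       // new
        System.out.println(cache.violations);                   // [k]
        System.out.println(cache.owners.get(1).get("k").value); // new
    }
}
```

Note that a policy like this treats a missing copy as stale and restores it, so it cannot tell a partial removal from a lost update, which matches the limitation listed under 4).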
> 3) Naming?
> The feature name is based on Cassandra's Read Repair feature [4], which is
> pretty similar.
>
> 4) Limitations that can be fixed in the future?
>   * MVCC and Near caches are not supported.
>   * Atomic caches can be checked (a false positive is possible on this
> check), but can't be recovered.
>   * Partial entry removal can't be recovered.
>   * Entries streamed using the data streamer (with an updater not based on
> "cache.put") and entries loaded by cache.load are perceived as inconsistent,
> since they may have different versions for the same keys.
>   * Only explicit get operations are supported (getAndReplace, getAndPut,
> etc. can be supported in the future).
>
> 5) What's left?
>   * SQL/ThinClient/etc. support.
>   * Metrics (found/repaired).
>   * A simple per-partition recovery feature able to work in the background,
> in addition to the per-entry recovery feature.
>
> 6) Is the code checked?
>   * Pull Request #5656 [5] (feature) - has green TC.
>   * Pull Request #6575 [6] (RunAll with the feature enabled for every get()
> request) - has a limited number of failures (because of data streamer,
> cache.load, etc.).
>
> Thoughts?
>
> [0] https://issues.apache.org/jira/browse/IGNITE-10663
> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-31+Consistency+check+and+fix
> [2] https://apacheignite-tools.readme.io/docs/control-script#section-verification-of-partition-checksums
> [3] https://github.com/apache/ignite/blob/27b6105ecc175b61e0aef59887830588dfc388ef/modules/core/src/main/java/org/apache/ignite/IgniteCache.java#L140
> [4] https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/operations/opsRepairNodesReadRepair.html
> [5] https://github.com/apache/ignite/pull/5656
> [6] https://github.com/apache/ignite/pull/6575
>
