zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Fangmin Lv (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ZOOKEEPER-3114) Built-in data consistency check inside ZooKeeper
Date Wed, 19 Sep 2018 03:14:00 GMT

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Fangmin Lv updated ZOOKEEPER-3114:
    Summary: Built-in data consistency check inside ZooKeeper  (was: Real time consistency
check between leader and learner)

> Built-in data consistency check inside ZooKeeper
> ------------------------------------------------
>                 Key: ZOOKEEPER-3114
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3114
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: quorum
>            Reporter: Fangmin Lv
>            Assignee: Fangmin Lv
>            Priority: Major
>             Fix For: 3.6.0
> The correctness of ZooKeeper was kind of proved in theory in ZAB paper, but the implementation
is a bit different from the paper, for example, save the currentEpoch and proposals/commits
upon to NEWLEADER is not atomic in current implementation, so the correctness of ZooKeeper
is not actually proved in reality.
> Also bugs could be introduced during implementation, issues like sending NEWLEADER packet
too early reported in ZOOKEEPER-3104 might be there since the beginning (didn't check exactly
when this was introduced). 
> More correctness issues were introduced when adding new features, like on disk txn sync,
local session, retain database, etc, both of these features added inconsistency bugs on
> To catch the consistency issue earlier, internally, we're running external consistency
checkers to compare nodes (digest), but that's not efficient (slow and expensive) and there
are corner cases we cannot cover in external checker. For example, we don't know the last
zxid before epoch change, which makes it's impossible to check it's missing txn or not. Another challenge is
the false negative which is hard to avoid due to fuzzy snapshot or expected txn gap during
snapshot syncing, etc.
> This Jira is going to propose a built-in real time consistency check by calculating
the digest of DataTree after applying each txn, and sending it over to learner during propose
time so that it can verify the correctness in real time. 
> The consistency check will cover all phases, including loading time during startup, syncing,
and broadcasting. It can help us avoid data lost or data corrupt due to bad disk and catch
bugs in code.
> The protocol change will make backward compatible to make sure we can enable/disable
this feature transparently.
> As for performance impact, based on our testing, it will add a bit overhead during runtime,
but doesn't have obvious impact in general.
> h2.  

This message was sent by Atlassian JIRA

View raw message