zookeeper-dev mailing list archives

From revans2 <...@git.apache.org>
Subject [GitHub] zookeeper pull request #157: ZOOKEEPER-2678: Discovery and Sync can take a v...
Date Thu, 26 Jan 2017 15:26:05 GMT
GitHub user revans2 opened a pull request:


    ZOOKEEPER-2678: Discovery and Sync can take a very long time on large DB

    This patch reduces recovery time when a leader is lost on a large DB.
    It does this in two ways: it no longer clears the DB before leader election begins, and
it avoids taking a snapshot as part of the SYNC phase, specifically for a DIFF sync. Instead,
it buffers the proposals and commits, just as the code currently does for proposals/commits
sent after the NEWLEADER and before the UPTODATE messages.
    If a SNAP is sent, we cannot avoid writing out the full snapshot, because there is no other
way to make sure the DB on disk is in sync with what is in memory. Without it, any edits written
to the edit log before a background snapshot happened could be applied on top of an incorrect
snapshot.
    The same optimization should work for TRUNC too, but I opted not to do it there because
TRUNC is rare, and by its very nature it already forces the DB to be reread after the edit
logs are modified, so it would still not be fast.
    In practice, instead of taking 5+ minutes for the cluster to recover from losing a leader,
it now takes about 3 seconds.
    I am happy to port this to 3.5 if it looks good.
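
The buffering idea described above can be illustrated with a minimal Java sketch. This is not the code from the patch or ZooKeeper's actual learner implementation: the class `SyncBufferSketch`, its methods, and the use of bare zxids in place of real packets are all hypothetical, chosen only to show proposals and commits being queued during a DIFF sync and applied in memory at UPTODATE, with a snapshot required only on the non-DIFF paths.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical illustration only; names and structure are assumptions. */
class SyncBufferSketch {
    enum SyncMode { DIFF, SNAP, TRUNC }

    // Proposals received from the leader but not yet committed, in zxid order.
    private final Deque<Long> pendingProposals = new ArrayDeque<>();
    // Zxids whose COMMIT has arrived, held until UPTODATE.
    private final List<Long> committed = new ArrayList<>();
    // Stand-in for the in-memory DB: the zxids applied so far, in order.
    private final List<Long> applied = new ArrayList<>();
    private final SyncMode mode;

    SyncBufferSketch(SyncMode mode) {
        this.mode = mode;
    }

    /** Buffer a proposal instead of applying it immediately. */
    void onProposal(long zxid) {
        pendingProposals.addLast(zxid);
    }

    /** A commit must match the oldest buffered proposal. */
    void onCommit(long zxid) {
        Long head = pendingProposals.peekFirst();
        if (head == null || head != zxid) {
            throw new IllegalStateException("commit " + zxid + " does not match pending proposal " + head);
        }
        pendingProposals.removeFirst();
        committed.add(zxid);
    }

    /**
     * Following the description: a DIFF sync can skip the snapshot, a SNAP
     * must write one, and TRUNC is left on the existing (slow) path.
     */
    boolean snapshotRequired() {
        return mode != SyncMode.DIFF;
    }

    /** On UPTODATE, apply the buffered commits to the in-memory state. */
    List<Long> onUpToDate() {
        applied.addAll(committed);
        committed.clear();
        return applied;
    }

    public static void main(String[] args) {
        SyncBufferSketch sync = new SyncBufferSketch(SyncMode.DIFF);
        sync.onProposal(101L);
        sync.onProposal(102L);
        sync.onCommit(101L);
        sync.onCommit(102L);
        System.out.println("snapshotRequired=" + sync.snapshotRequired());
        System.out.println("applied=" + sync.onUpToDate());
    }
}
```

In this sketch an uncommitted proposal simply stays in the pending queue after UPTODATE, so it can still be committed later. How the real patch persists the buffered transactions to the edit log is not shown here.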

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/revans2/zookeeper ZOOKEEPER-2678

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #157

commit 5aa25620e0189b28d7040305272be2fda28126fb
Author: Robert (Bobby) Evans <evans@yahoo-inc.com>
Date:   2017-01-19T19:50:32Z

    ZOOKEEPER-2678: Discovery and Sync can take a very long time on large DBs


If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
