hadoop-zookeeper-dev mailing list archives

From "Benjamin Reed (JIRA)" <j...@apache.org>
Subject [jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing
Date Thu, 19 Nov 2009 23:21:39 GMT

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Benjamin Reed updated ZOOKEEPER-582:

    Status: Open  (was: Patch Available)

looks good mahadev, just two things:

1) (minor) in getLastLoggedZxid() you should be using maxLogZxid instead of calling getLastLoggedZxid()

2) when doing the sanity check with the leader's zxid you should be checking epochs, not zxids.
it is possible for a follower to see something later and have to truncate from the same epoch,
but a follower should never see a later epoch.
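
To illustrate point 2, here is a minimal sketch of an epoch-based sanity check. It assumes the usual zxid layout (a 64-bit value with the epoch in the high 32 bits and a counter in the low 32 bits). The class and method names (`EpochCheck`, `epochOf`, `makeZxid`, `followerMayJoin`) are hypothetical and are not taken from the patch.

```java
public final class EpochCheck {
    // Assumption: the epoch occupies the high 32 bits of the zxid.
    public static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    // Hypothetical helper to build a zxid from an epoch and a counter.
    public static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    // Compare epochs only. A follower that is ahead within the same epoch
    // is acceptable (it truncates); a follower with a later epoch is not.
    public static boolean followerMayJoin(long followerZxid, long leaderZxid) {
        return epochOf(followerZxid) <= epochOf(leaderZxid);
    }

    public static void main(String[] args) {
        long leader = makeZxid(27, 5);
        // Same epoch, later counter: allowed, the follower truncates.
        System.out.println(followerMayJoin(makeZxid(27, 9), leader)); // true
        // Later epoch than the leader: should never happen, so reject.
        System.out.println(followerMayJoin(makeZxid(28, 0), leader)); // false
    }
}
```

Comparing raw zxids instead would wrongly reject the first case, since a follower at (27, 9) has a larger zxid than a leader at (27, 5).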

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.1, 3.1.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch,
>                      ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds
> in the data directory. unfortunately, in the quorum version of zookeeper updates are logged
> using an epoch based on the latest log file in a directory. if there is a snapshot with a
> higher epoch than the log files, the zookeeper server will start logging using an epoch one
> higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files,
> zookeeper will start logging changes using epoch 1. if the cluster restarts, the state will
> be restored from the snapshot with the epoch of 27, which, in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not affect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it
> comes from the snapshot or log, and should include a sanity check to make sure that a follower
> never connects to a leader that has a lower epoch than its own.
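
The epoch selection described in the issue can be sketched as follows. This is only an illustration under the assumption that the epoch is the high 32 bits of a 64-bit zxid and that an empty log directory yields a last-logged zxid of 0. The names `NextEpoch`, `nextEpochFromLogsOnly`, and `nextEpoch` are hypothetical and do not come from the ZooKeeper code base.

```java
public final class NextEpoch {
    // Assumption: the epoch occupies the high 32 bits of the zxid.
    public static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    // Behaviour described as the bug: only the log files are consulted.
    public static long nextEpochFromLogsOnly(long lastLogZxid) {
        return epochOf(lastLogZxid) + 1;
    }

    // Behaviour the issue asks for: one higher than the current state,
    // whether that state comes from the snapshot or from the log.
    public static long nextEpoch(long lastSnapshotZxid, long lastLogZxid) {
        return Math.max(epochOf(lastSnapshotZxid), epochOf(lastLogZxid)) + 1;
    }

    public static void main(String[] args) {
        long snapshotZxid = 27L << 32; // snapshot taken in epoch 27
        long noLogs = 0L;              // assumed value when no log files exist

        System.out.println(nextEpochFromLogsOnly(noLogs));   // 1
        System.out.println(nextEpoch(snapshotZxid, noLogs)); // 28
    }
}
```

With the logs-only rule, new updates are written under epoch 1, so a later restart prefers the epoch 27 snapshot and the newer updates are lost. Taking the maximum over both sources avoids that.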

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
