zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Kurt Young (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ZOOKEEPER-1118) Inconsistent data after server crashes several times
Date Wed, 06 Jul 2011 02:58:16 GMT
Inconsistent data after server crashes several times
----------------------------------------------------

                 Key: ZOOKEEPER-1118
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1118
             Project: ZooKeeper
          Issue Type: Bug
          Components: quorum
    Affects Versions: 3.3.2
         Environment: Redhat RHEL5
            Reporter: Kurt Young
            Priority: Critical


I think there is a bug when Follower try to sync data with Leader.
Assume there are some operations committed during one server had been crashed. When the server
restart, it will receive a NEWLEADER packet which include the last zxid of leader and the
server will set its own lastProcessZxid to the leader's. 
{code:title=Follower.java|borderStyle=solid}
void followLeader() throws InterruptedException {
    fzk.registerJMX(new FollowerBean(this, zk), self.jmxLocalPeerBean);
    try {
        InetSocketAddress addr = findLeader();
        try {
            connectToLeader(addr);
            long newLeaderZxid = registerWithLeader(Leader.FOLLOWERINFO);  // get the last
zxid from leader
            //check to see if the leader zxid is lower than ours                         
                                                                
            //this should never happen but is just a safety check                        
                                                                
            long lastLoggedZxid = self.getLastLoggedZxid();
            if ((newLeaderZxid >> 32L) < (lastLoggedZxid >> 32L)) {
                LOG.fatal("Leader epoch " + Long.toHexString(newLeaderZxid >> 32L)
                        + " is less than our epoch " + Long.toHexString(lastLoggedZxid >>
32L));
                throw new IOException("Error: Epoch of leader is lower");
            }
            syncWithLeader(newLeaderZxid);   // set its own lastProcessZxid to leader's last
zxid
{code}

Then, some COMMIT packets will be received by the server in order to sync the data with leader.
And then, the leader will send an UPTODATE packet to server to take a snapshot. 
{code:title=Follower.java|borderStyle=solid}
protected void processPacket(QuorumPacket qp) throws IOException{
    switch (qp.getType()) {
    case Leader.PING:
        ping(qp);
        break;
    case Leader.PROPOSAL:
        TxnHeader hdr = new TxnHeader();
        BinaryInputArchive ia = BinaryInputArchive
        .getArchive(new ByteArrayInputStream(qp.getData()));
        Record txn = SerializeUtils.deserializeTxn(ia, hdr);
        if (hdr.getZxid() != lastQueued + 1) {
            LOG.warn("Got zxid 0x"
                    + Long.toHexString(hdr.getZxid())
                    + " expected 0x"
                    + Long.toHexString(lastQueued + 1));
        }
        lastQueued = hdr.getZxid();
        fzk.logRequest(hdr, txn);
        break;
    case Leader.COMMIT:
        fzk.commit(qp.getZxid());
        break;
    case Leader.UPTODATE:
        fzk.takeSnapshot();
        self.cnxnFactory.setZooKeeperServer(fzk);
        break;
    case Leader.REVALIDATE:
        revalidate(qp);
        break;
    case Leader.SYNC:
        fzk.sync();
        break;
    }
}
{code}
Notice the different way the Follower treat the COMMIT and the UPTODATE packets. When receives
a COMMIT packet, the follower will give this to a processor to deal with. But if receives
a UPTODATE packet, the follower will take a snapshot immediately. So it is possible that the
server will take snapshot before it commits all the operations it missed. Then if the server
crashed again and recovered´╝î it will recover its data from the snapshot, so the date inconsistent
with the leader now, but its last zxid is the same. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

Mime
View raw message