Date: Wed, 6 Jul 2011 02:58:16 +0000 (UTC)
From: "Kurt Young (JIRA)"
To: dev@zookeeper.apache.org
Subject: [jira] [Created] (ZOOKEEPER-1118) Inconsistent data after server crashes several times

Inconsistent data after server crashes several times
----------------------------------------------------

Key: ZOOKEEPER-1118
URL: 
https://issues.apache.org/jira/browse/ZOOKEEPER-1118
Project: ZooKeeper
Issue Type: Bug
Components: quorum
Affects Versions: 3.3.2
Environment: Redhat RHEL5
Reporter: Kurt Young
Priority: Critical

I think there is a bug in how a Follower syncs its data with the Leader.

Assume some operations were committed while one server was down. When that server restarts, it receives a NEWLEADER packet that includes the leader's last zxid, and the server sets its own lastProcessZxid to the leader's value.

{code:title=Follower.java|borderStyle=solid}
void followLeader() throws InterruptedException {
    fzk.registerJMX(new FollowerBean(this, zk), self.jmxLocalPeerBean);
    try {
        InetSocketAddress addr = findLeader();
        try {
            connectToLeader(addr);
            long newLeaderZxid = registerWithLeader(Leader.FOLLOWERINFO); // get the last zxid from leader
            //check to see if the leader zxid is lower than ours
            //this should never happen but is just a safety check
            long lastLoggedZxid = self.getLastLoggedZxid();
            if ((newLeaderZxid >> 32L) < (lastLoggedZxid >> 32L)) {
                LOG.fatal("Leader epoch " + Long.toHexString(newLeaderZxid >> 32L)
                        + " is less than our epoch " + Long.toHexString(lastLoggedZxid >> 32L));
                throw new IOException("Error: Epoch of leader is lower");
            }
            syncWithLeader(newLeaderZxid); // set its own lastProcessZxid to leader's last zxid
{code}

Then the server receives some COMMIT packets in order to sync its data with the leader.
After that, the leader sends an UPTODATE packet to the server, telling it to take a snapshot.

{code:title=Follower.java|borderStyle=solid}
protected void processPacket(QuorumPacket qp) throws IOException{
    switch (qp.getType()) {
    case Leader.PING:
        ping(qp);
        break;
    case Leader.PROPOSAL:
        TxnHeader hdr = new TxnHeader();
        BinaryInputArchive ia = BinaryInputArchive
                .getArchive(new ByteArrayInputStream(qp.getData()));
        Record txn = SerializeUtils.deserializeTxn(ia, hdr);
        if (hdr.getZxid() != lastQueued + 1) {
            LOG.warn("Got zxid 0x"
                    + Long.toHexString(hdr.getZxid())
                    + " expected 0x"
                    + Long.toHexString(lastQueued + 1));
        }
        lastQueued = hdr.getZxid();
        fzk.logRequest(hdr, txn);
        break;
    case Leader.COMMIT:
        fzk.commit(qp.getZxid());
        break;
    case Leader.UPTODATE:
        fzk.takeSnapshot();
        self.cnxnFactory.setZooKeeperServer(fzk);
        break;
    case Leader.REVALIDATE:
        revalidate(qp);
        break;
    case Leader.SYNC:
        fzk.sync();
        break;
    }
}
{code}

Notice the different ways the Follower treats COMMIT and UPTODATE packets. When it receives a COMMIT packet, the follower hands it to a processor to deal with. But when it receives an UPTODATE packet, the follower takes a snapshot immediately. So it is possible for the server to take a snapshot before it has committed all the operations it missed. If the server then crashes again and recovers, it restores its data from that snapshot, so its data is now inconsistent with the leader's even though its last zxid is the same.
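
To make the ordering problem concrete, here is a minimal, single-threaded toy model of the sequence described in this report. It is not ZooKeeper code: `SnapshotRaceSketch`, `ToyFollower`, and all of their methods and fields are hypothetical names invented for illustration. The asynchronous commit processor is modelled as a plain queue that has not been drained yet, and "drain before snapshot" is only one possible mitigation, shown for contrast; it is not presented as the project's actual fix.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model of the sync ordering problem; none of these names are ZooKeeper classes. */
public class SnapshotRaceSketch {

    /** Hypothetical follower: COMMITs are queued and applied later by a processor. */
    static class ToyFollower {
        final Queue<Long> pendingCommits = new ArrayDeque<>(); // queued, not yet applied
        final List<Long> appliedZxids = new ArrayList<>();     // stands in for the in-memory data
        long lastProcessZxid;                                   // taken from the leader on NEWLEADER
        List<Long> snapshotData;                                // what a restart would load
        long snapshotZxid;                                      // zxid recorded with the snapshot

        void onNewLeader(long leaderZxid) {
            lastProcessZxid = leaderZxid;
        }

        /** COMMIT is only handed off; it returns before the transaction is applied. */
        void onCommit(long zxid) {
            pendingCommits.add(zxid);
        }

        /** Stands in for the asynchronous commit processor catching up. */
        void drainCommits() {
            while (!pendingCommits.isEmpty()) {
                appliedZxids.add(pendingCommits.poll());
            }
        }

        /** Behaviour described in the report: snapshot immediately on UPTODATE. */
        void onUpToDateImmediate() {
            takeSnapshot();
        }

        /** Assumed mitigation for comparison: apply everything queued, then snapshot. */
        void onUpToDateAfterDrain() {
            drainCommits();
            takeSnapshot();
        }

        private void takeSnapshot() {
            snapshotData = new ArrayList<>(appliedZxids);
            snapshotZxid = lastProcessZxid;
        }
    }

    /** Replays NEWLEADER(3), COMMIT 1..3, UPTODATE against a fresh follower. */
    static ToyFollower replay(boolean drainFirst) {
        ToyFollower f = new ToyFollower();
        f.onNewLeader(3L);
        f.onCommit(1L);
        f.onCommit(2L);
        f.onCommit(3L);
        if (drainFirst) {
            f.onUpToDateAfterDrain();
        } else {
            f.onUpToDateImmediate();
        }
        return f;
    }

    public static void main(String[] args) {
        ToyFollower racy = replay(false);
        ToyFollower safe = replay(true);
        System.out.println("immediate: zxid=" + racy.snapshotZxid
                + " txns=" + racy.snapshotData.size());
        System.out.println("drained:   zxid=" + safe.snapshotZxid
                + " txns=" + safe.snapshotData.size());
    }
}
```

In this model both snapshots record zxid 3, but the one taken immediately contains none of the three queued transactions. That is the same mismatch as in the report: the last zxid matches the leader while the data does not.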