Message-ID: <4D884BB5.6010803@nicira.com>
Date: Tue, 22 Mar 2011 00:11:49 -0700
From: Jeremy Stribling
To: user@zookeeper.apache.org
CC: dev@zookeeper.apache.org
Subject: Re: new leader accepting create requests too early?
References: <4D880657.2040005@nicira.com>

Thanks for the response. I thought that the "my state" in this line,
printed in node #3's log:

> 2672 [WorkerReceiver Thread] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification:
> 215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING
> (n.state), 126 (n.sid), LEADING (my state)

indicated that node #3 was the leader, but I'm probably misinterpreting
it (I haven't had a chance to look through the source yet to figure it
out for sure). In any case, what I think are the relevant notifications
of node #1's logs look like this:

> 307122 [WorkerReceiver Thread] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification:
> 37 (n.leader), 17179869831 (n.zxid), 1 (n.round), LOOKING (n.state),
> 37 (n.sid), LEADING (my state)
> 307142 [WorkerReceiver Thread] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification:
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LOOKING (n.state),
> 37 (n.sid), LEADING (my state)
> 310850 [WorkerReceiver Thread] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification:
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
> 215 (n.sid), LOOKING (my state)
> 310850 [WorkerReceiver Thread] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification:
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
> 37 (n.sid), LOOKING (my state)
> 311051 [QuorumPeer:/0.0.0.0:2888] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification
> time out: 400
> 311053 [WorkerReceiver Thread] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification:
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
> 37 (n.sid), LOOKING (my state)
> 311054 [WorkerReceiver Thread] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification:
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
> 215 (n.sid), LOOKING (my state)
> 311454 [QuorumPeer:/0.0.0.0:2888] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification
> time out: 800
> 311456 [WorkerReceiver Thread] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification:
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
> 37 (n.sid), LOOKING (my state)
> 311457 [WorkerReceiver Thread] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification:
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
> 215 (n.sid), LOOKING (my state)
> 312257 [QuorumPeer:/0.0.0.0:2888] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification
> time out: 1600
> 312260 [WorkerReceiver Thread] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification:
> 215 (n.leader), 17179869918 (n.zxid), 4 (n.round), LOOKING (n.state),
> 215 (n.sid), LOOKING (my state)
> 312263 [WorkerReceiver Thread] INFO
> org.apache.zookeeper.server.quorum.FastLeaderElection - Notification:
> 37 (n.leader), 17179869831 (n.zxid), 3 (n.round), LEADING (n.state),
> 37 (n.sid), LOOKING (my state)

which, according to my earlier logic, seems to indicate that node #1
never even thought it was following node #3.

Anyway, I will put the logs together and make a JIRA tomorrow if I get
some time, and will follow up here with a link.

Thanks again,

Jeremy
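
For anyone comparing the notifications quoted above, here is a rough sketch of a helper that pulls the fields out of one FastLeaderElection "Notification:" line and splits the zxid into its two halves. It relies on the ZooKeeper convention that the high 32 bits of a zxid are the epoch and the low 32 bits are a counter. The names `parse_notification` and `split_zxid` are made up for illustration and are not part of ZooKeeper, and the regular expression only assumes the exact format shown in these logs, with quote markers and line wraps already removed.

```python
import re

# Matches the notification format seen in the quoted log lines only;
# other ZooKeeper versions may log a different layout (assumption).
NOTIFICATION = re.compile(
    r"Notification:\s+(?P<leader>\d+) \(n\.leader\), "
    r"(?P<zxid>\d+) \(n\.zxid\), "
    r"(?P<round>\d+) \(n\.round\), "
    r"(?P<state>\w+) \(n\.state\), "
    r"(?P<sid>\d+) \(n\.sid\), "
    r"(?P<my_state>\w+) \(my state\)"
)


def split_zxid(zxid):
    """Return (epoch, counter): high and low 32 bits of a zxid."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def parse_notification(line):
    """Parse one unwrapped notification line into a dict, or None."""
    # Collapse runs of whitespace so stray double spaces do not matter.
    match = NOTIFICATION.search(" ".join(line.split()))
    if match is None:
        return None  # e.g. a "Notification time out" line
    epoch, counter = split_zxid(int(match.group("zxid")))
    return {
        "leader": int(match.group("leader")),
        "zxid": int(match.group("zxid")),
        "epoch": epoch,
        "counter": counter,
        "round": int(match.group("round")),
        "state": match.group("state"),
        "sid": int(match.group("sid")),
        "my_state": match.group("my_state"),
    }


if __name__ == "__main__":
    line = (
        "2672 [WorkerReceiver Thread] INFO "
        "org.apache.zookeeper.server.quorum.FastLeaderElection - Notification: "
        "215 (n.leader), 12884902548 (n.zxid), 3 (n.round), FOLLOWING "
        "(n.state), 126 (n.sid), LEADING (my state)"
    )
    print(parse_notification(line))
```

Run against the lines above, this puts 12884902548 in epoch 3 and both 17179869831 and 17179869918 in epoch 4, which may be a quicker way to line up the two nodes' logs than reading the raw numbers.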