Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 11AAE200BA0 for ; Fri, 14 Oct 2016 22:40:40 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 103A8160ADD; Fri, 14 Oct 2016 20:40:40 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 2F1A4160AD3 for ; Fri, 14 Oct 2016 22:40:39 +0200 (CEST) Received: (qmail 58229 invoked by uid 500); 14 Oct 2016 20:40:37 -0000 Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@zookeeper.apache.org Delivered-To: mailing list user@zookeeper.apache.org Received: (qmail 58217 invoked by uid 99); 14 Oct 2016 20:40:37 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 14 Oct 2016 20:40:37 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 1BEBAC2044 for ; Fri, 14 Oct 2016 20:40:37 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 3.981 X-Spam-Level: *** X-Spam-Status: No, score=3.981 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=2, HTML_OBFUSCATE_05_10=0.001, KAM_BADIPHTTP=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001, WEIRD_PORT=0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=avinetworks-com.20150623.gappssmtp.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id qTo7BdYL3kl0 for ; Fri, 14 Oct 2016 20:40:34 +0000 (UTC) Received: from mail-oi0-f45.google.com (mail-oi0-f45.google.com [209.85.218.45]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 915225F119 for ; Fri, 14 Oct 2016 20:40:33 +0000 (UTC) Received: by mail-oi0-f45.google.com with SMTP id y2so150960567oie.0 for ; Fri, 14 Oct 2016 13:40:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=avinetworks-com.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=5Ylte9VrXRQM0iaNhwvN4EV8bvhgKIgdAnjuVdzuk+k=; b=lJyzoktc+FFJNU/+Ijo6+l1lEr3Hz/9WOxF5pJrEl1Oeg8EMyv6g7BIQ93gy9FZ+EY nrfkvtTBQJkEVQdziX9X4aVe20XQnAl07KZ7AMrAUcXCfRQgwnFR8sQcT7SsJWavuxnU MV7eRGt12S5uATDIdj1QOpLvmRDWvwF/2/6KTu776L35kOObbrUDfLzkK6wASaisJYHd jsHfLX76Y1hEi6U9NpPQAmIasSLaTEBPTCZM3ZNim0IuCt/dhfzgvBRoTY5RwAfVOo3g airg4CjAWQUTmEofNwJVr7mlcUaZziI4r0I1cSFLZxewkqVqpRvSQXtzlWErBwEaLSFe WaCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=5Ylte9VrXRQM0iaNhwvN4EV8bvhgKIgdAnjuVdzuk+k=; b=d+EX1M2USp09Yv54Eg1E1aB8/B8bJtYpH0hlROTYDoO+zV16JxueMYzSSN45fWUEWS VTzN4dKGGMTK7QJ4a1jH+NuRfHFJrrpq1CminDWgxGOsxEelSfyD7gEw0bTB1u9Lw9DU lYIeMXbTVZtpbsYuJ4AN9J4ZSaQLC2IC2xx2sFVVZfnE9T8MlP1+05+mFX68dhj7sxze vDWbBV5Yjvyh9tm499rDdiUM/hzAl7CMOT7phFZpkR6DY/y6Zs3MiZ9A9gbdfqjsijHx 7YUrOY9PpQb/pyftFMbAuoTPJHKZlT0Cjo6RyUHvZxgrO6cSVVRMhgUPmnkA5a/Xe0Fu z8eQ== X-Gm-Message-State: AA6/9RnYay876hOjQ3P1nl+C9bzQxh7djEeW/+aNv0swfxAmHTsMkfB29OGTsXo4WSoxfFdw04I3Y3ImwDDCgQ== X-Received: by 10.157.2.101 with SMTP id 92mr6557113otb.227.1476477624987; Fri, 14 Oct 2016 13:40:24 -0700 (PDT) MIME-Version: 1.0 Received: by 10.157.52.194 with HTTP; Fri, 14 Oct 2016 13:40:24 -0700 (PDT) In-Reply-To: References: <1A9BB7DA-9D76-40CF-92C8-743A3B418743@apache.org> From: Anand Parthasarathy Date: Fri, 14 Oct 2016 13:40:24 -0700 Message-ID: Subject: Re: Zookeeper leader election takes a long time. To: Michael Han Cc: UserZooKeeper Content-Type: multipart/alternative; boundary=94eb2c03918c5b4ee9053ed93c69 archived-at: Fri, 14 Oct 2016 20:40:40 -0000 --94eb2c03918c5b4ee9053ed93c69 Content-Type: text/plain; charset=UTF-8 Hi Michael and Ryan, Thanks for looking into the logs. Actually, in this case, node 2 was brought down ungracefully and was down permanently. In this customer deployment, our product runs as a docker container in a bare metal environment. It is possible that the docker proxy that provides external connectivity to the zookeeper ports is down but the container itself is up and running for several more minutes. If you grep for "2 (n.sid)" you will see that there is no packets after 12:46. Convergence after that did not happen until 15:57 or so. Thanks, Anand. On Thu, Oct 13, 2016 at 5:10 PM, Michael Han wrote: > >> it started a new round but then I seem to see the election messages > from Node 2 again. Any idea why? > > My guess is node 2 is back online and ZK service was started. In this use > case node 2 does not get offline permanently, IIUC. > > > On Thu, Oct 13, 2016 at 3:41 PM, Ryan Zhang > wrote: > >> Hi, Anand, I took a look and I wonder how do you explain the following >> >> The N1.log starts at around >> 2016-10-03 12:37:38,073 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1: >> 5002:QuorumPeer@714] - LOOKING >> >> and it failed to talk to Node 2 (id: 2) >> 2016-10-03 12:37:38,136 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCn >> xManager@382] - Cannot open channel to 2 at election address >> node2.controller.local/127.0.0.8:5004 >> java.net.ConnectException: Connection refused >> at java.net.PlainSocketImpl.socketConnect(Native Method) >> at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSock >> etImpl.java:339) >> at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPl >> ainSocketImpl.java:200) >> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocket >> Impl.java:182) >> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) >> at java.net.Socket.connect(Socket.java:579) >> >> However, after failed to get any election messages from Node 3, it >> started a new round but then I seem to see the election messages from Node >> 2 again. Any idea why? >> >> 2016-10-03 12:38:32,815 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1: >> 5002:QuorumPeer@714] - LOOKING >> 2016-10-03 12:38:32,816 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1: >> 5002:FileSnap@83] - Reading snapshot /var/lib/zookeeper/version-2/s >> napshot.20002aa0b >> 2016-10-03 12:38:32,875 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1: >> 5002:FastLeaderElection@815] - New election. My id = 1, proposed >> zxid=0x20002bfb5 >> 2016-10-03 12:38:32,876 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLe >> aderElection@597] - Notification: 1 (message format version), 1 >> (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round), LOOKING (n.state), 1 >> (n.sid), 0x2 (n.peerEpoch) LOOKING (my state) >> 2016-10-03 12:38:32,878 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLe >> aderElection@597] - Notification: 1 (message format version), 2 >> (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round), LOOKING (n.state), 2 >> (n.sid), 0x2 (n.peerEpoch) LOOKING (my state) >> 2016-10-03 12:38:32,878 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLe >> aderElection@597] - Notification: 1 (message format version), 2 >> (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round), LOOKING (n.state), 1 >> (n.sid), 0x2 (n.peerEpoch) LOOKING (my state) >> 2016-10-03 12:38:32,917 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLe >> aderElection@597] - Notification: 1 (message format version), 3 >> (n.leader), 0x20002bfb5 (n.zxid), 0x4 (n.round), LEADING (n.state), 3 >> (n.sid), 0x2 (n.peerEpoch) LOOKING (my state) >> 2016-10-03 12:38:33,117 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1: >> 5002:FastLeaderElection@849] - Notification time out: 400 >> 2016-10-03 12:38:33,118 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLe >> aderElection@597] - Notification: 1 (message format version), 2 >> (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round), LOOKING (n.state), 1 >> (n.sid), 0x2 (n.peerEpoch) LOOKING (my state) >> 2016-10-03 12:38:33,159 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLe >> aderElection@597] - Notification: 1 (message format version), 3 >> (n.leader), 0x20002bfb5 (n.zxid), 0x4 (n.round), LEADING (n.state), 3 >> (n.sid), 0x2 (n.peerEpoch) LOOKING (my state) >> 2016-10-03 12:38:33,559 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1: >> 5002:FastLeaderElection@849] - Notification time out: 800 >> 2016-10-03 12:38:33,560 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLe >> aderElection@597] - Notification: 1 (message format version), 2 >> (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round), LOOKING (n.state), 1 >> (n.sid), 0x2 (n.peerEpoch) LOOKING (my state) >> 2016-10-03 12:38:33,561 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLe >> aderElection@597] - Notification: 1 (message format version), 2 >> (n.leader), 0x20002bfb5 (n.zxid), 0x6 (n.round), LEADING (n.state), 2 >> (n.sid), 0x2 (n.peerEpoch) LOOKING (my state) >> 2016-10-03 12:38:33,601 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLe >> aderElection@597] - Notification: 1 (message format version), 3 >> (n.leader), 0x20002bfb5 (n.zxid), 0x4 (n.round), LEADING (n.state), 3 >> (n.sid), 0x2 (n.peerEpoch) LOOKING (my state) >> 2016-10-03 12:38:34,402 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1: >> 5002:FastLeaderElection@849] - Notification time out: 1600 >> >> >> >> On Oct 8, 2016, at 10:05 AM, Anand Parthasarathy < >> anpartha@avinetworks.com> wrote: >> >> Hi Flavio, >> >> I have attached the logs from node 1 and node 3. Node 2 was powered off >> around 10-03 12:36. Leader election kept going until 10-03 15:57:16 when it >> finally converged. >> >> Thanks, >> Anand. >> >> On Sat, Oct 8, 2016 at 7:55 AM, Flavio Junqueira > fpj@apache.org>> wrote: >> Hi Anand, >> >> I don't understand whether 1 and 3 were able or even trying to connect to >> each other. They should be able to elect a leader between them and make >> progress. You might want to upload logs and let us know. >> >> -Flavio >> >> > On 08 Oct 2016, at 02:11, Anand Parthasarathy > > wrote: >> > >> > Hi, >> > >> > We are currently using zookeeper 3.4.6 version and use a 3 node >> solution in >> > our system. We see that occasionally, when a node is powered off (in >> this >> > instance, it was actually a leader node), the remaining two nodes do not >> > form a quorum for a really long time. Looking at the logs, it appears >> the >> > sequence is as follows: >> > - Node 2 is the zookeeper leader >> > - Node 2 is powered off >> > - Node 1 and Node 3 recognize and start the election >> > - Node 3 times out after initLimit * tickTime with "Timeout while >> waiting >> > for quorum" for Round N >> > - Node 1 times out after initLimit * tickTime with "Exception while >> trying >> > to follow leader" for Round N+1 at the same time. >> > - And the process continues where N is sequentially incrementing. >> > - This happens for a long time. >> > - In one instance, we used tickTime=5000 and initLimit=20 and it took >> > around 3.5 hours to converge. >> > - In a given round, Node 1 will try connecting to Node 2, gets >> connection >> > refused waits for notification timeout which increases by 2 every >> iteration >> > until it hits the initLimit. Connection Refused is because the node 2 >> comes >> > up after reboot, but zookeeper process is not started (due to a >> different >> > failure). >> > >> > It looks similar to ZOOKEEPER-2164 but there it is a connection timeout >> > where Node 2 is not reachable. >> > >> > Could you pls. share if you have seen this issue and if so, what is the >> > workaround that can be employed in 3.4.6. >> > >> > Thanks, >> > Anand. >> >> >> >> >> > > > -- > Cheers > Michael. > --94eb2c03918c5b4ee9053ed93c69--