Return-Path: X-Original-To: apmail-zookeeper-user-archive@www.apache.org Delivered-To: apmail-zookeeper-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 9369918503 for ; Fri, 12 Jun 2015 16:50:33 +0000 (UTC) Received: (qmail 7757 invoked by uid 500); 12 Jun 2015 16:50:32 -0000 Delivered-To: apmail-zookeeper-user-archive@zookeeper.apache.org Received: (qmail 7709 invoked by uid 500); 12 Jun 2015 16:50:32 -0000 Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@zookeeper.apache.org Delivered-To: mailing list user@zookeeper.apache.org Received: (qmail 7696 invoked by uid 99); 12 Jun 2015 16:50:32 -0000 Received: from Unknown (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 12 Jun 2015 16:50:32 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 1CE0A1A5522 for ; Fri, 12 Jun 2015 16:50:32 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 5.002 X-Spam-Level: ***** X-Spam-Status: No, score=5.002 tagged_above=-999 required=6.31 tests=[HTML_MESSAGE=3, KAM_BADIPHTTP=2, NORMAL_HTTP_TO_IP=0.001, WEIRD_PORT=0.001] autolearn=disabled Received: from mx1-eu-west.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id q2Ki4I47com8 for ; Fri, 12 Jun 2015 16:50:16 +0000 (UTC) Received: from mail-oi0-f46.google.com (mail-oi0-f46.google.com [209.85.218.46]) by mx1-eu-west.apache.org (ASF Mail Server at mx1-eu-west.apache.org) with ESMTPS id 6829825DB7 for ; Fri, 12 Jun 2015 16:50:13 +0000 (UTC) Received: by oiha141 with SMTP id a141so25213600oih.0 for ; Fri, 12 Jun 2015 09:50:12 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=iIsM45gI5GTPfO+vmX3EsJ2rab6ikQkDHmMW1mEACao=; b=EjC3gQejGJO58Gav/FqMugTxulSnLGq6+3iUvLvtx7VwnrOoDOrtU6a82JX7EJAqD1 49pDh5VADjnue8/HU8giGBsFBXoLPZwPj5pnKgfb7EDYxv9rk4t0k6jHPgeQVDM3IBdE xs0Vr5rG9ljhkR9nF4/sSh5Ib/j4cB9r2YJDOKN6LQtKVsJ+VsvE+E2qbQx/PaSZ3wzS Qm4kbl68gJlbgc04ZAH8v/gTE/O5e0GsMW8xYuSx3r5aBj1V6YuLS0UkcPFszD1mh37U JhWfkQldhBE8sbJGMYZRFir4Zc5ZMrmhfIYpBK0dxQxd2yfCt++XgkKD2QKl1CMoVu1X yvKw== X-Gm-Message-State: ALoCoQmkVHauJZladR0L2v7XF7wGtTrGcl3vFu6Mu18Z4yokgRWcfhPkPMa4rLyq75RD0EZPNQbH MIME-Version: 1.0 X-Received: by 10.182.199.34 with SMTP id jh2mr13137411obc.48.1434127812149; Fri, 12 Jun 2015 09:50:12 -0700 (PDT) Received: by 10.202.172.73 with HTTP; Fri, 12 Jun 2015 09:50:12 -0700 (PDT) X-Originating-IP: [24.6.229.36] In-Reply-To: <1758297252.2101765.1434120408277.JavaMail.yahoo@mail.yahoo.com> References: <201506121730468917544@chinatransinfo.com> <1758297252.2101765.1434120408277.JavaMail.yahoo@mail.yahoo.com> Date: Fri, 12 Jun 2015 09:50:12 -0700 Message-ID: Subject: Re: Exception when following leader From: =?UTF-8?B?UmHDumwgR3V0acOpcnJleiBTZWdhbMOpcw==?= To: "user@zookeeper.apache.org" , Flavio Junqueira Content-Type: multipart/alternative; boundary=e89a8ff1ccd6ce6716051854e627 --e89a8ff1ccd6ce6716051854e627 Content-Type: text/plain; charset=UTF-8 Yeah: " fsync-ing the write ahead log in SyncThread:2 took 13600ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide" looks suspicious. Either too much contention for IO or a big GC pause would cause a timeout writing (or reading) to/from the leader which would then kick the follower out. -rgs On 12 June 2015 at 07:46, Flavio Junqueira wrote: > Do you happen to know what happened on the leader side? The fsync warning > is also a bit odd... > -Flavio > > > On Friday, June 12, 2015 3:30 PM, "gouwei@chinatransinfo.com" < > gouwei@chinatransinfo.com> wrote: > > > > Hi! > I have a question: zookeeper is inexplicable shutdown. Do you know why? > My log information is as follows: > > 2015-06-12 13:03:54,536 [myid:2] - INFO > [NIOWorkerThread-8:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.3:27426 which had sessionid 0x14dada66bde0004 > 2015-06-12 13:03:54,998 [myid:2] - WARN > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@92] - Exception when > following the leader > java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:392) > at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) > at > org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83) > at > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103) > at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:160) > at > org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:88) > at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:991) > 2015-06-12 13:03:55,000 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:MBeanRegistry@119] - Unregister > MBean > [org.apache.ZooKeeperService:name0=ReplicatedServer_id2,name1=replica.2,name2=Follower] > 2015-06-12 13:03:55,001 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@197] - shutdown called > java.lang.Exception: shutdown Follower > at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:197) > at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:995) > 2015-06-12 13:03:55,001 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:LearnerZooKeeperServer@161] - > Shutting down > 2015-06-12 13:03:55,001 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@443] - shutting > down > 2015-06-12 13:03:55,002 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FollowerRequestProcessor@135] - > Shutting down > 2015-06-12 13:03:55,003 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:CommitProcessor@357] - Shutting > down > 2015-06-12 13:03:55,003 [myid:2] - INFO > [FollowerRequestProcessor:2:FollowerRequestProcessor@106] - > FollowerRequestProcessor exited loop! > 2015-06-12 13:03:55,003 [myid:2] - INFO > [CommitProcessor:2:CommitProcessor@192] - CommitProcessor exited loop! > 2015-06-12 13:03:55,004 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@476] - > shutdown of request processor complete > 2015-06-12 13:03:55,005 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:MBeanRegistry@119] - Unregister > MBean > [org.apache.ZooKeeperService:name0=ReplicatedServer_id2,name1=replica.2,name2=Follower,name3=InMemoryDataTree] > 2015-06-12 13:03:55,005 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@191] - > Shutting down > 2015-06-12 13:03:57,841 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.2:62539 > 2015-06-12 13:03:57,843 [myid:2] - WARN > [NIOWorkerThread-1:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:03:57,843 [myid:2] - INFO > [NIOWorkerThread-1:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.2:62539 (no session established for client) > 2015-06-12 13:04:01,017 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.2:62542 > 2015-06-12 13:04:01,022 [myid:2] - WARN > [NIOWorkerThread-2:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:01,023 [myid:2] - INFO > [NIOWorkerThread-2:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.2:62542 (no session established for client) > 2015-06-12 13:04:01,121 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.3:27432 > 2015-06-12 13:04:01,122 [myid:2] - WARN > [NIOWorkerThread-3:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:01,122 [myid:2] - INFO > [NIOWorkerThread-3:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.3:27432 (no session established for client) > 2015-06-12 13:04:01,506 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.1:63948 > 2015-06-12 13:04:01,507 [myid:2] - WARN > [NIOWorkerThread-4:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:01,507 [myid:2] - INFO > [NIOWorkerThread-4:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.1:63948 (no session established for client) > 2015-06-12 13:04:04,137 [myid:2] - WARN [SyncThread:2:FileTxnLog@335] - > fsync-ing the write ahead log in SyncThread:2 took 13600ms which will > adversely effect operation latency. See the ZooKeeper troubleshooting guide > 2015-06-12 13:04:04,138 [myid:2] - WARN > [SyncThread:2:SendAckRequestProcessor@64] - Closing connection to leader, > exception during packet send > java.net.SocketException: Socket closed > at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:116) > at java.net.SocketOutputStream.write(SocketOutputStream.java:153) > at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) > at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) > at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:146) > at > org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:62) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:186) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113) > 2015-06-12 13:04:04,139 [myid:2] - INFO > [SyncThread:2:SyncRequestProcessor@169] - SyncRequestProcessor exited! > 2015-06-12 13:04:04,139 [myid:2] - WARN > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@1038] - PeerState set > to LOOKING > 2015-06-12 13:04:04,139 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@913] - LOOKING > 2015-06-12 13:04:04,664 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.3:27435 > 2015-06-12 13:04:04,665 [myid:2] - WARN > [NIOWorkerThread-6:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:04,666 [myid:2] - INFO > [NIOWorkerThread-6:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.3:27435 (no session established for client) > 2015-06-12 13:04:04,707 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.2:62545 > 2015-06-12 13:04:04,708 [myid:2] - WARN > [NIOWorkerThread-5:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:04,708 [myid:2] - INFO > [NIOWorkerThread-5:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.2:62545 (no session established for client) > 2015-06-12 13:04:04,818 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.1:63952 > 2015-06-12 13:04:04,818 [myid:2] - WARN > [NIOWorkerThread-7:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:04,819 [myid:2] - INFO > [NIOWorkerThread-7:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.1:63952 (no session established for client) > 2015-06-12 13:04:05,024 [myid:2] - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@682] - Notification: 2 > (message format version), 3 (n.leader), 0x8000002c9 (n.zxid), 0x2 > (n.round), LOOKING (n.state), 3 (n.sid), 0x8 (n.peerEPoch), LOOKING (my > state)100000000 (n.config version) > 2015-06-12 13:04:05,288 [myid:2] - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@682] - Notification: 2 > (message format version), 3 (n.leader), 0x8000002c9 (n.zxid), 0x2 > (n.round), LOOKING (n.state), 3 (n.sid), 0x8 (n.peerEPoch), LOOKING (my > state)100000000 (n.config version) > 2015-06-12 13:04:05,690 [myid:2] - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@682] - Notification: 2 > (message format version), 3 (n.leader), 0x8000002c9 (n.zxid), 0x2 > (n.round), LOOKING (n.state), 3 (n.sid), 0x8 (n.peerEPoch), LOOKING (my > state)100000000 (n.config version) > 2015-06-12 13:04:06,459 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.2:62548 > 2015-06-12 13:04:06,460 [myid:2] - WARN > [NIOWorkerThread-8:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:06,460 [myid:2] - INFO > [NIOWorkerThread-8:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.2:62548 (no session established for client) > > Some configuration: > clientPort=2181 > dataDir=/opt/zookeeper > syncLimit=2 > tickTime=2000 > initLimit=5 > server.1=kafka1:2888:3888:participant > server.2=kafka2:2888:3888:participant > server.3=kafka3:2888:3888:participant > > thinks! > > > > > gouwei@chinatransinfo.com > > > > > --e89a8ff1ccd6ce6716051854e627--