From: Akmal Abbasov
To: user@zookeeper.apache.org
Subject: Unstable work of zookeeper
Date: Thu, 24 Sep 2015 14:24:50 +0200
Message-id: <568AD511-F4AD-42EF-A964-DADCB0B0DAAC@icloud.com>

Hi,

I am using ZooKeeper 3.4.6 with a Spark cluster configured for HA. Once every 1-2 days, the active Spark master shuts down with the following messages:

15/09/23 18:58:18 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x34ffa68dbd10021, likely server has closed socket, closing socket connection and attempting reconnect
15/09/23 18:58:18 INFO state.ConnectionStateManager: State change: SUSPENDED
15/09/23 18:58:18 INFO master.ZooKeeperLeaderElectionAgent: We have lost leadership
15/09/23 18:58:18 ERROR master.Master: Leadership has been revoked -- master shutting down.
15/09/23 18:58:18 INFO util.Utils: Shutdown hook called

I don’t have the ZooKeeper logs from the same period, but the logs are full of messages like these:

2015-09-24 05:07:42,228 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.0.8.4:34705
2015-09-24 05:07:42,229 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /10.0.8.4:34705; will be dropped if server is in r-o mode
2015-09-24 05:07:42,229 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /10.0.8.4:34705
2015-09-24 05:07:42,292 [myid:1] - INFO [CommitProcessor:1:ZooKeeperServer@617] - Established session 0x14ffd3670130030 with negotiated timeout 20001 for client /10.0.8.4:34705
2015-09-24 05:07:42,302 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x14ffd3670130030, likely client has closed socket
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
    at java.lang.Thread.run(Thread.java:745)
2015-09-24 05:07:42,303 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.0.8.4:34705 which had sessionid 0x14ffd3670130030
2015-09-24 05:07:42,314 [myid:1] - ERROR [CommitProcessor:1:NIOServerCnxn@178] - Unexpected Exception:
java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
    at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
    at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
    at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2015-09-24 05:07:42,334 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.0.8.4:34707
2015-09-24 05:07:42,334 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /10.0.8.4:34707; will be dropped if server is in r-o mode
2015-09-24 05:07:42,335 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /10.0.8.4:34707
2015-09-24 05:07:42,357 [myid:1] - INFO [CommitProcessor:1:ZooKeeperServer@617] - Established session 0x14ffd3670130031 with negotiated timeout 20001 for client /10.0.8.4:34707
2015-09-24 05:07:42,364 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x14ffd3670130031, likely client has closed socket
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
    at java.lang.Thread.run(Thread.java:745)
2015-09-24 05:07:42,365 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.0.8.4:34707 which had sessionid 0x14ffd3670130031
2015-09-24 05:07:42,376 [myid:1] - ERROR [CommitProcessor:1:NIOServerCnxn@178] - Unexpected Exception:
java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
    at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
    at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
    at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)

There are also entries like these:

2015-09-24 06:29:54,459 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FollowerZooKeeperServer@139] - Shutting down
2015-09-24 06:29:54,459 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@441] - shutting down
2015-09-24 06:29:54,459 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FollowerRequestProcessor@105] - Shutting down
2015-09-24 06:29:54,459 [myid:1] - INFO [FollowerRequestProcessor:1:FollowerRequestProcessor@95] - FollowerRequestProcessor exited loop!
2015-09-24 06:29:54,460 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:CommitProcessor@181] - Shutting down
2015-09-24 06:29:54,464 [myid:1] - INFO [CommitProcessor:1:CommitProcessor@150] - CommitProcessor exited loop!
2015-09-24 06:29:54,465 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@415] - shutdown of request processor complete
2015-09-24 06:29:54,466 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@209] - Shutting down
2015-09-24 06:29:54,466 [myid:1] - INFO [SyncThread:1:SyncRequestProcessor@187] - SyncRequestProcessor exited!
2015-09-24 06:29:54,466 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@714] - LOOKING
2015-09-24 06:29:54,584 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.0.8.58:36137
2015-09-24 06:29:54,584 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-09-24 06:29:54,584 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.0.8.58:36137 (no session established for client)
2015-09-24 06:29:54,679 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.0.8.57:57410
2015-09-24 06:29:54,680 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2015-09-24 06:29:54,680 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.0.8.57:57410 (no session established for client)

I also observed that hadoop-zkfc restarts very frequently. Any ideas what could be wrong?

Thanks.