Return-Path: X-Original-To: apmail-zookeeper-user-archive@www.apache.org Delivered-To: apmail-zookeeper-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 3E8CA17ED3 for ; Fri, 28 Aug 2015 00:34:43 +0000 (UTC) Received: (qmail 28571 invoked by uid 500); 28 Aug 2015 00:34:42 -0000 Delivered-To: apmail-zookeeper-user-archive@zookeeper.apache.org Received: (qmail 28519 invoked by uid 500); 28 Aug 2015 00:34:42 -0000 Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@zookeeper.apache.org Delivered-To: mailing list user@zookeeper.apache.org Received: (qmail 28500 invoked by uid 99); 28 Aug 2015 00:34:42 -0000 Received: from Unknown (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 28 Aug 2015 00:34:42 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 9E2CCC0E0F for ; Fri, 28 Aug 2015 00:34:41 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.11 X-Spam-Level: X-Spam-Status: No, score=0.11 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, NORMAL_HTTP_TO_IP=0.001, RCVD_IN_MSPIKE_H2=-0.001, SPF_PASS=-0.001, T_DKIM_INVALID=0.01, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd4-us-west.apache.org (amavisd-new); dkim=fail (1024-bit key) reason="fail (bad RSA signature)" header.d=airtime.com Received: from mx1-us-east.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id XbXAAQjECDKZ for ; Fri, 28 Aug 2015 00:34:32 +0000 (UTC) Received: from mail-pa0-f53.google.com (mail-pa0-f53.google.com [209.85.220.53]) by mx1-us-east.apache.org (ASF Mail Server at mx1-us-east.apache.org) with ESMTPS id 4005242B9A for ; Fri, 28 Aug 2015 00:34:32 +0000 (UTC) Received: by pabzx8 with SMTP id zx8so42500700pab.1 for ; Thu, 27 Aug 2015 17:34:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=airtime.com; s=airtime; h=from:content-type:content-transfer-encoding:subject:message-id:date :to:mime-version; bh=tEABGwfEEKdfVk5WoSQXY0/Zubo/uGtmoDH+wICUNF8=; b=MxIlKot9oBuS7qpEsnfFWM55d4nmSTnzEnIOQiaq8A/IIXCsulwBUp+/WhRclEanJg gV1KrFYnCmAiOKT4sQi0iaxB9mi5uO9PZEPb3s8Qwo92fTZUgPSuKQyvHaPuXy4HYV5b C7LPi+NONiEQnVqwT07woIsPD+oUcZOHeqXj8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:content-type:content-transfer-encoding :subject:message-id:date:to:mime-version; bh=tEABGwfEEKdfVk5WoSQXY0/Zubo/uGtmoDH+wICUNF8=; b=fSQ+mCWimLJoPQzM+RSOHks3I4e2p9p84vE6IouDFgtMLM8NMFIVNlJ0AX3mjz/i7h o1UM9J1yxmDCjcPobvtERScO2h8BPGJ9eEfdVNxZ9zmai291p7rIghk+xhkgmdsB891v fhUu0bbw1mdau/N4EYrtq8yldrvLN2qAtX8chpn2pOuUkJTHsD8JcpdMbTvi+FM+x46o rPq+E1j8vK/Zl6JYhvyGXnpeutiVbI3CD5FMlf2j72plB/Ixr3jtSqOKI/B0/xHrsXhF tLJwAX87ZTGqpcRkSmduISkOlyf0FRQshb4Zfajpnk8b7pSJFGoTaWGdeAd+xghlTLKe klpQ== X-Gm-Message-State: ALoCoQlPoHfJ7Oz9Z8Z1rQ91W88M4YdIBUxysgl4n2J7uR98eS4Cn3Bmf6Vabr6Om7f8Qzie6lAx X-Received: by 10.66.66.196 with SMTP id h4mr10927316pat.120.1440722071416; Thu, 27 Aug 2015 17:34:31 -0700 (PDT) Received: from [192.168.128.49] ([66.201.54.106]) by smtp.gmail.com with ESMTPSA id am4sm3673665pbd.58.2015.08.27.17.34.30 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 27 Aug 2015 17:34:30 -0700 (PDT) From: Jim Wong Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Intermittent Disconnects from C Clients Message-Id: <559BB5E2-F0D4-4A50-834D-931CA728E44A@airtime.com> Date: Thu, 27 Aug 2015 17:34:29 -0700 To: user@zookeeper.apache.org Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) X-Mailer: Apple Mail (2.2104) Folks, We are building an application that talks to ZooKeeper using the C = bindings to enable service discovery. Our integration is fairly simple and is = generally working well from a functional standpoint, but we are encountering a problem = with this new application that we have not experienced in the past when we used the = Java bindings. What=E2=80=99s happening is that our servers are occasionally (every = hour or two) being disconnected from ZooKeeper and subsequently need to re-connect. There = doesn=E2=80=99t seem to be any pattern in terms of the timing or frequency of these = occurrences. When the problem occurs, the following is logged: Aug 27 22:47:55 bixby-i-65e767a3-us-west-2 [2015-Aug-27 22:47:55.326399] = [info] [zkLogBridge] [0x00007f7992bff700] 2015-08-27 = 22:47:55,326:7312(0x7f7990fff700):ZOO_DEBUG@zookeeper_process@2264: Got = ping response in 74 ms Aug 27 22:48:01 bixby-i-65e767a3-us-west-2 [2015-Aug-27 22:48:01.993356] = [error] [zkLogBridge] [0x00007f7992bff700] 2015-08-27 = 22:48:01,993:7312(0x7f7990fff700):ZOO_ERROR@handle_socket_error_msg@1643: = Socket [10.0.19.22:2181] zk retcode=3D-7, errno=3D110(Connection timed = out): connection to 10.0.19.22:2181 timed out (exceeded timeout by 0ms) Aug 27 22:48:01 bixby-i-65e767a3-us-west-2 [2015-Aug-27 22:48:01.993409] = [info] [zkClient]{0x7f799e446a40} [0x00007f7992bff700] ZK session = watcher with type=3DEvent:Session state=3DState:Connecting The last of these is our own log indicating that we received the = notification that the connection had been dropped. On the ZooKeeper side of things, the logs look like this: Aug 27 18:36:54 bixby-i-65e767a3-us-west-2 [2015-Aug-27 18:36:54.880795] = [info] [zkLogBridge] [0x00007f7992bff700] 2015-08-27 = 18:36:54,880:7312(0x7f7990fff700):ZOO_INFO@check_events@1752: session = establishment complete on server [52.5.153.158:2181], = sessionId=3D0x34f6b4380d90006, negotiated timeout=3D10000 ... 2015-08-27 22:47:55,283 [myid:2] - DEBUG = [CommitProcessor:2:FinalRequestProcessor@160] - = sessionid:0x34f6b4380d90006 type:ping cxid:0xfffffffffffffffe = zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2015-08-27 22:48:03,095 [myid:2] - DEBUG [ProcessThread(sid:2 = cport:-1)::CommitProcessor@171] - Processing request:: = sessionid:0x34f6b4380d90006 type:ping cxid:0xfffffffffffffffe = zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2015-08-27 22:48:03,095 [myid:2] - DEBUG = [CommitProcessor:2:FinalRequestProcessor@88] - Processing request:: = sessionid:0x34f6b4380d90006 type:ping cxid:0xfffffffffffffffe = zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2015-08-27 22:48:03,095 [myid:2] - DEBUG = [CommitProcessor:2:FinalRequestProcessor@160] - = sessionid:0x34f6b4380d90006 type:ping cxid:0xfffffffffffffffe = zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a2015-08-27 = 22:48:03,095 [myid:2] - WARN = [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught = end of stream exception EndOfStreamException: Unable to read additional data from client = sessionid 0x34f6b4380d90006, likely client has closed socket at = org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at = org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.= java:208) at java.lang.Thread.run(Thread.java:745 2015-08-27 22:48:03,095 [myid:2] - INFO = [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed = socket connection for client /54.200.140.196:39721 which had sessionid = 0x34f6b4380d90006 So, it looks like the client got annoyed that it didn=E2=80=99t receive = a response in time from the server, but it also looks like the server was generally responding = pretty quickly. It=E2=80=99s likely that we=E2=80=99re doing something wrong here, but = I=E2=80=99m at a loss as to what that might be. Does anyone have any insights that might explain this = behavior?