Return-Path: X-Original-To: apmail-zookeeper-user-archive@www.apache.org Delivered-To: apmail-zookeeper-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 2F78817B82 for ; Fri, 26 Sep 2014 16:49:21 +0000 (UTC) Received: (qmail 80184 invoked by uid 500); 26 Sep 2014 16:49:20 -0000 Delivered-To: apmail-zookeeper-user-archive@zookeeper.apache.org Received: (qmail 80136 invoked by uid 500); 26 Sep 2014 16:49:20 -0000 Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@zookeeper.apache.org Delivered-To: mailing list user@zookeeper.apache.org Received: (qmail 80125 invoked by uid 99); 26 Sep 2014 16:49:20 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 26 Sep 2014 16:49:20 +0000 X-ASF-Spam-Status: No, hits=2.5 required=5.0 tests=HTML_MESSAGE,NORMAL_HTTP_TO_IP,RCVD_IN_DNSWL_LOW,SPF_SOFTFAIL,WEIRD_PORT X-Spam-Check-By: apache.org Received-SPF: softfail (athena.apache.org: transitioning domain of ianrose@fullstory.com does not designate 209.85.216.177 as permitted sender) Received: from [209.85.216.177] (HELO mail-qc0-f177.google.com) (209.85.216.177) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 26 Sep 2014 16:49:16 +0000 Received: by mail-qc0-f177.google.com with SMTP id x13so6947797qcv.8 for ; Fri, 26 Sep 2014 09:48:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:from:date:message-id:subject:to :content-type; bh=Dy6eHFj06PgdSeXiAILIGM5wAvnEtZdUJekl3xfl+9w=; b=HWLaqD/qerpJKsmZaEWEOwWR43rCNS03S0Z69v26chfgk94xMfKpykPCrHzw/xoT2E CdSNHy4KPw1LLTXqUiPWClPeOqoBi66MS3Jcu/sTtPjnAzclXcZn9vlNGAsrSGuo0iW+ LdxP6hBzwJV0xEX7bN3gYHfCidMocz1PTXDjUVpyw8FGTbhzKHnlHssJTpl63WsWog/B xU+HTAskUR6F9BNCwemCu5UPfuM9p/iyPhGAp0hGMpLEXegETHilS7MjLhs6b/lnkCSZ TTGHFy9LBwOa6wSeS6q99dCdvOYvpTR5dNm8KUCTQqmGevdwZayn1yDL9Ta57AXJttih GWDg== X-Gm-Message-State: ALoCoQlahpjUIoP7Qn3MANPB5PenCAA1JnULnKOAFCv07Fj5UTSNJrs13eE/nDwN37p1tFL+H3jL X-Received: by 10.224.20.9 with SMTP id d9mr30606728qab.7.1411750135176; Fri, 26 Sep 2014 09:48:55 -0700 (PDT) MIME-Version: 1.0 Received: by 10.96.233.103 with HTTP; Fri, 26 Sep 2014 09:48:35 -0700 (PDT) From: Ian Rose Date: Fri, 26 Sep 2014 12:48:35 -0400 Message-ID: Subject: Heartbeats not being received / responded to? To: user@zookeeper.apache.org Content-Type: multipart/alternative; boundary=001a11c1c56051bbc30503fab1a8 X-Virus-Checked: Checked by ClamAV on apache.org --001a11c1c56051bbc30503fab1a8 Content-Type: text/plain; charset=UTF-8 Hi all - I've just gotten started using SolrCloud, which uses Zookeeper for coordination; I am otherwise completely new to Zookeeper. Now I am trying to query Zookeeper directly for some simple information. I am finding, however, that although my clients are able to connect they very frequently receive timeouts. It almost seems like the server isn't receiving the heartbeat messages at all (or isn't responding to them). I've seen similar behavior both when using the go-zookeeper and kazoo (python) client libraries (I wanted to try >1 to ensure that it wasn't a client lib problem). My config is very simple: I am running the client and a single Zookeeper node on the same machine (my laptop). There are no other clients of the Zookeeper node while I am running these tests, so there is no practical possibility that the JVM is overloaded or GCing. Here is a very basic kazoo client that I am using. Obviously it isn't doing any "real" work right now - this is just to demonstrate the disconnects. #!/usr/bin/env python2.7 from kazoo.client import KazooClient, KazooState import logging import time def my_listener(state): if state == KazooState.LOST: # Register somewhere that the session was lost logging.warning('handle lost') elif state == KazooState.SUSPENDED: # Handle being disconnected from Zookeeper logging.debug('handle being disconnected') else: # Handle being connected/reconnected to Zookeeper logging.debug('handle being (re)connected') if __name__ == '__main__': logging.basicConfig(format='%(asctime)-15s %(levelname)s %(message)s', level=logging.DEBUG) logging.debug('starting...') zk = KazooClient(hosts='127.0.0.1:2181', timeout=30) zk.start() zk.add_listener(my_listener) time.sleep(35) zk.stop() -------------------- Here is my zookeeper config: # The number of milliseconds of each tick tickTime=2000 # The number of ticks that the initial # synchronization phase can take initLimit=10 # The number of ticks that can pass between # sending a request and getting an acknowledgement syncLimit=5 # the directory where the snapshot is stored. dataDir=/Users/ianrose/Code/zookeeper/var/data # the port at which the clients will connect clientPort=2181 # The number of snapshots to retain in dataDir autopurge.snapRetainCount=5 # Purge task interval in hours # Set to "0" to disable auto purge feature autopurge.purgeInterval=1 -------------------- Here is the output I get on the client: 2014-09-26 12:43:20,603 DEBUG starting... 2014-09-26 12:43:20,604 INFO Connecting to 127.0.0.1:2181 2014-09-26 12:43:20,605 DEBUG Sending request(xid=None): Connect(protocol_version=0, last_zxid_seen=0, time_out=30000, session_id=0, passwd='\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', read_only=None) 2014-09-26 12:43:20,609 INFO Zookeeper connection established, state: CONNECTED 2014-09-26 12:43:40,222 WARNING Connection dropped: outstanding heartbeat ping not received 2014-09-26 12:43:40,222 WARNING Transition to CONNECTING 2014-09-26 12:43:40,222 INFO Zookeeper connection lost 2014-09-26 12:43:40,222 DEBUG handle being disconnected 2014-09-26 12:43:40,747 INFO Connecting to 127.0.0.1:2181 2014-09-26 12:43:40,748 DEBUG Sending request(xid=None): Connect(protocol_version=0, last_zxid_seen=0, time_out=30000, session_id=92520388231233540, passwd='\xd1M\xb9\xb3\xae\xab\xa1!@x\x06nv\xb7\xe3*', read_only=None) 2014-09-26 12:43:40,750 INFO Zookeeper connection established, state: CONNECTED 2014-09-26 12:43:40,751 DEBUG handle being (re)connected 2014-09-26 12:43:55,611 DEBUG Sending request(xid=1): Close() 2014-09-26 12:43:55,614 INFO Closing connection to 127.0.0.1:2181 2014-09-26 12:43:55,615 INFO Zookeeper session lost, state: CLOSED 2014-09-26 12:43:55,615 WARNING handle lost And here is trace-level logging from the server side: 2014-09-26 12:43:20,605 [myid:] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197][] - Accepted socket connection from /127.0.0.1:58959 2014-09-26 12:43:20,605 [myid:] - DEBUG [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@810][] - Session establishment request from client /127.0.0.1:58959 client's lastZxid is 0x0 2014-09-26 12:43:20,605 [myid:] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868][] - Client attempting to establish new session at /127.0.0.1:58959 2014-09-26 12:43:20,605 [myid:] - TRACE [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:ZooTrace@71][] - SessionTrackerImpl --- Adding session 0x148b2cd8b010004 30000 2014-09-26 12:43:20,606 [myid:] - TRACE [ProcessThread(sid:0 cport:-1)::ZooTrace@90][] - :Psessionid:0x148b2cd8b010004 type:createSession cxid:0x0 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2014-09-26 12:43:20,606 [myid:] - TRACE [ProcessThread(sid:0 cport:-1)::ZooTrace@71][] - SessionTrackerImpl --- Existing session 0x148b2cd8b010004 30000 2014-09-26 12:43:20,608 [myid:] - DEBUG [SyncThread:0:FinalRequestProcessor@88][] - Processing request:: sessionid:0x148b2cd8b010004 type:createSession cxid:0x0 zxid:0x78f txntype:-10 reqpath:n/a 2014-09-26 12:43:20,608 [myid:] - TRACE [SyncThread:0:ZooTrace@90][] - :Esessionid:0x148b2cd8b010004 type:createSession cxid:0x0 zxid:0x78f txntype:-10 reqpath:n/a 2014-09-26 12:43:20,608 [myid:] - TRACE [SyncThread:0:ZooTrace@71][] - SessionTrackerImpl --- Existing session 0x148b2cd8b010004 30000 2014-09-26 12:43:20,608 [myid:] - DEBUG [SyncThread:0:FinalRequestProcessor@160][] - sessionid:0x148b2cd8b010004 type:createSession cxid:0x0 zxid:0x78f txntype:-10 reqpath:n/a 2014-09-26 12:43:20,609 [myid:] - INFO [SyncThread:0:ZooKeeperServer@617][] - Established session 0x148b2cd8b010004 with negotiated timeout 30000 for client /127.0.0.1:58959 2014-09-26 12:43:40,748 [myid:] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197][] - Accepted socket connection from /127.0.0.1:58960 2014-09-26 12:43:40,749 [myid:] - DEBUG [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@810][] - Session establishment request from client /127.0.0.1:58960 client's lastZxid is 0x0 2014-09-26 12:43:40,749 [myid:] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861][] - Client attempting to renew session 0x148b2cd8b010004 at /127.0.0.1:58960 2014-09-26 12:43:40,749 [myid:] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007][] - Closed socket connection for client /127.0.0.1:58959 which had sessionid 0x148b2cd8b010004 2014-09-26 12:43:40,750 [myid:] - TRACE [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:ZooTrace@71][] - Session 0x148b2cd8b010004 is valid: true 2014-09-26 12:43:40,750 [myid:] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@617][] - Established session 0x148b2cd8b010004 with negotiated timeout 30000 for client /127.0.0.1:58960 2014-09-26 12:43:50,742 [myid:] - DEBUG [SyncThread:0:FinalRequestProcessor@88][] - Processing request:: sessionid:0x148b2cd8b010004 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2014-09-26 12:43:50,742 [myid:] - DEBUG [SyncThread:0:FinalRequestProcessor@160][] - sessionid:0x148b2cd8b010004 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2014-09-26 12:43:55,612 [myid:] - TRACE [ProcessThread(sid:0 cport:-1)::ZooTrace@90][] - :Psessionid:0x148b2cd8b010004 type:closeSession cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2014-09-26 12:43:55,612 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::SessionTrackerImpl@198][] - Session closing: 0x148b2cd8b010004 2014-09-26 12:43:55,612 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@494][] - Processed session termination for sessionid: 0x148b2cd8b010004 2014-09-26 12:43:55,613 [myid:] - DEBUG [SyncThread:0:FinalRequestProcessor@88][] - Processing request:: sessionid:0x148b2cd8b010004 type:closeSession cxid:0x1 zxid:0x790 txntype:-11 reqpath:n/a 2014-09-26 12:43:55,614 [myid:] - TRACE [SyncThread:0:ZooTrace@90][] - :Esessionid:0x148b2cd8b010004 type:closeSession cxid:0x1 zxid:0x790 txntype:-11 reqpath:n/a 2014-09-26 12:43:55,614 [myid:] - TRACE [SyncThread:0:ZooTrace@71][] - SessionTrackerImpl --- Removing session 0x148b2cd8b010004 2014-09-26 12:43:55,614 [myid:] - DEBUG [SyncThread:0:FinalRequestProcessor@160][] - sessionid:0x148b2cd8b010004 type:closeSession cxid:0x1 zxid:0x790 txntype:-11 reqpath:n/a 2014-09-26 12:43:55,614 [myid:] - TRACE [SyncThread:0:NIOServerCnxn@168][] - Add a buffer to outgoingBuffers, sk sun.nio.ch.SelectionKeyImpl@28cfdeb5 is valid: true 2014-09-26 12:43:55,615 [myid:] - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007][] - Closed socket connection for client /127.0.0.1:58960 which had sessionid 0x148b2cd8b010004 I'm assuming that this is due to a misunderstanding on my part. Can anyone point me in the right direction? Much appreciated! - Ian --001a11c1c56051bbc30503fab1a8--