From: Ted Dunning
Date: Tue, 21 Sep 2010 11:09:47 -0700
Subject: Re: Expiring session... timeout of 600000ms exceeded
To: zookeeper-user@hadoop.apache.org

To answer your last question first: no, you don't have to do anything explicit to keep the ZK connection alive. It is maintained by a dedicated thread. You do have to keep your Java program responsive, and ZK problems like this almost always indicate that your program is checking out for extended periods of time.

My strong guess is that something evil is happening with your Java processes that is actually causing this delay. Since you have tiny heaps, it probably isn't GC. Since you have a bunch of processes, swap and process wakeup delays seem plausible. What is the load average on your box?

On the topic of your application, why are you using processes instead of threads? With threads, you can get your per-crawler memory overhead down to tens of kilobytes instead of tens of megabytes. Also, why not use something like Bixo so you don't have to prototype a threaded crawler?

On Tue, Sep 21, 2010 at 8:24 AM, Tim Robertson wrote:
> Hi all,
>
> I am seeing a lot of my clients being kicked out after the 10 minute
> negotiated timeout is exceeded.
> My clients are each a JVM (around 100 running on a machine) which are
> doing web crawling of specific endpoints and handling the response XML
> - so they do wait around for 3-4 minutes on HTTP timeouts, but
> certainly not 10 mins.
> I am just prototyping right now on a 2x quad-core Mac Pro with 12GB
> memory, and the 100 child processes only get -Xmx64m, and I don't see
> my machine exhausted.
>
> Do my clients need to do anything in order to initiate keep-alive
> heartbeats, or should this be automatic (I thought the tickTime would
> dictate this)?
>
> # my conf is:
> tickTime=2000
> dataDir=/Volumes/Data/zookeeper
> clientPort=2181
> maxClientCnxns=10000
> minSessionTimeout=4000
> maxSessionTimeout=800000
>
> Thanks for any pointers to this newbie,
> Tim
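[List-archive note] The 600000 ms in the subject line is the *negotiated* session timeout: the client requests a timeout when it connects, and the server clamps it into the [minSessionTimeout, maxSessionTimeout] range from the conf above. A minimal sketch of that clamping rule (this is a standalone illustration, not the actual ZooKeeper server code):

```java
public class SessionTimeoutNegotiation {
    // Server-side clamp of the client's requested session timeout,
    // mirroring ZooKeeper's min/maxSessionTimeout behavior (sketch only).
    static int negotiate(int requestedMs, int minMs, int maxMs) {
        if (requestedMs < minMs) return minMs;
        if (requestedMs > maxMs) return maxMs;
        return requestedMs;
    }

    public static void main(String[] args) {
        int min = 4000, max = 800000;                    // from the conf above
        System.out.println(negotiate(600000, min, max)); // 600000 (in range)
        System.out.println(negotiate(1000, min, max));   // clamped up to 4000
        System.out.println(negotiate(900000, min, max)); // clamped down to 800000
    }
}
```

With maxSessionTimeout=800000, a client asking for 600000 ms gets exactly that, which matches the "timeout of 600000ms exceeded" message.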
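[List-archive note] Ted's point that the program is "checking out for extended periods" can be tested empirically: a daemon thread that sleeps a fixed interval and measures the overshoot will reveal JVM-wide stalls (GC, swapping, scheduler starvation) well before a 10-minute session expires. A hedged sketch; the interval and threshold values are arbitrary choices:

```java
public class StallWatchdog implements Runnable {
    private final long intervalMs;
    private final long warnThresholdMs;

    StallWatchdog(long intervalMs, long warnThresholdMs) {
        this.intervalMs = intervalMs;
        this.warnThresholdMs = warnThresholdMs;
    }

    // How far past the intended sleep interval we actually woke up.
    static long overshootMs(long lastNanos, long nowNanos, long intervalMs) {
        return (nowNanos - lastNanos) / 1_000_000 - intervalMs;
    }

    @Override
    public void run() {
        long last = System.nanoTime();
        while (!Thread.currentThread().isInterrupted()) {
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                return;
            }
            long now = System.nanoTime();
            long overshoot = overshootMs(last, now, intervalMs);
            if (overshoot > warnThresholdMs) {
                // The whole JVM (or the box) stalled for ~overshoot ms:
                // a GC pause, swapping, or the scheduler starving us.
                System.err.println("JVM stalled ~" + overshoot + " ms");
            }
            last = now;
        }
    }

    public static void main(String[] args) {
        Thread t = new Thread(new StallWatchdog(1000, 500), "stall-watchdog");
        t.setDaemon(true);
        t.start();
        // ... rest of the crawler runs here ...
    }
}
```

If this logs multi-minute stalls, the session expirations are the symptom, not the cause.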
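[List-archive note] The threads-instead-of-processes suggestion can be sketched with a fixed thread pool: one JVM with 100 workers costs roughly a thread stack per crawler (tunable via -Xss) rather than a 64 MB child heap per process. The endpoint list and fetch step below are placeholders, not part of the original post:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadedCrawler {
    // Crawl every URL using nThreads workers; returns the number completed.
    static int crawlAll(List<String> urls, int nThreads) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        for (String url : urls) {
            pool.submit(() -> {
                // fetch(url) and parse the response XML would go here
                // (placeholder); a slow HTTP endpoint blocks only this
                // worker thread, not the whole JVM.
                done.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> endpoints = List.of("http://example.org/a",
                                         "http://example.org/b"); // placeholders
        System.out.println(crawlAll(endpoints, 100) + " endpoints crawled");
    }
}
```

One caveat in the ZooKeeper context: a single shared session also means one GC pause now affects all 100 crawlers at once, so heap sizing matters more in the threaded design.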