From Tim Robertson <timrobertson...@gmail.com>
Subject Re: Expiring session... timeout of 600000ms exceeded
Date Tue, 21 Sep 2010 18:53:55 GMT
Thanks Ted,

> To answer your last question first: no, you don't have to do anything
> explicit to keep the ZK connection alive.  It is maintained by a dedicated
> thread.  You do have to keep your Java program responsive, and ZK problems
> like this almost always indicate that you have a problem with your program
> checking out for extended periods of time.
> My strong guess is that you have something evil happening with your Java
> process that is actually causing this delay.
> Since you have tiny memory, it probably isn't GC.  Since you have a bunch of
> processes, swap and process wakeup delays seem plausible.  What is the load
> average on your box?

CPU spikes when responses come in, but mostly the processes sit in IO
wait on the endpoints (timeout of 3 minutes).  I suspect HttpClient 4 is
dropping into a retry mechanism, but I have not investigated this yet.
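
One way to test the "process checking out" theory is to measure how late a
short sleep wakes up inside each client JVM.  The sketch below is only an
illustration: PauseMonitor, overshootMs and start are made-up names, not part
of ZooKeeper, and the interval and threshold values are assumptions.  A large
overshoot suggests the whole JVM was stalled (swap, scheduling, GC) rather
than one thread waiting on HTTP.

```java
public final class PauseMonitor {
    private PauseMonitor() {}

    /** How much longer than requested a sleep took, in milliseconds (never negative). */
    public static long overshootMs(long requestedSleepMs, long startNanos, long endNanos) {
        long elapsedMs = (endNanos - startNanos) / 1_000_000L;
        return Math.max(0L, elapsedMs - requestedSleepMs);
    }

    /**
     * Starts a daemon thread that sleeps in short intervals and logs to stderr
     * whenever a wake-up is later than thresholdMs (both values are assumptions
     * to tune per machine).
     */
    public static Thread start(final long sleepMs, final long thresholdMs) {
        Thread monitor = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                long start = System.nanoTime();
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                long late = overshootMs(sleepMs, start, System.nanoTime());
                if (late > thresholdMs) {
                    System.err.println("JVM stalled for about " + late + " ms");
                }
            }
        }, "pause-monitor");
        monitor.setDaemon(true);
        monitor.start();
        return monitor;
    }
}
```

Calling PauseMonitor.start(100, 1000) at client startup would log any stall
longer than about a second; stalls approaching the session timeout would
point at the machine rather than at ZooKeeper.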

> On the topic of your application, why are you using processes instead of
> threads?  With threads, you can get your memory overhead down to tens of
> kilobytes as opposed to tens of megabytes.

I am just prototyping scaling out to many processes, potentially
across multiple machines.  Our live crawler runs in a single JVM, but
some of these crawlers take 4-6 weeks, and such long-running jobs block
the others, so I was looking at alternatives.  Our live crawler also
uses DOM-based XML parsing and so hits memory limits; SAX would address
this.  We also want to be able to deploy patches to the crawlers
without interrupting those long-running jobs, if possible.
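
For comparison, the thread-based layout suggested in the quoted question
could look roughly like the sketch below.  It is a minimal illustration
using java.util.concurrent, not the live crawler: ThreadedCrawlers, crawl and
crawlAll are hypothetical names, and crawl is a placeholder for a real job.
Note that a job running for weeks still occupies a pool thread for that long,
and a shared JVM does not help with patching crawlers independently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public final class ThreadedCrawlers {
    private ThreadedCrawlers() {}

    /** Placeholder for one crawl job; a real job would page through an endpoint. */
    public static String crawl(String endpoint) {
        return "done:" + endpoint;
    }

    /** Runs one crawl job per endpoint on a fixed pool and returns results in input order. */
    public static List<String> crawlAll(List<String> endpoints, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (final String endpoint : endpoints) {
                Callable<String> job = () -> crawl(endpoint);
                futures.add(pool.submit(job));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that job has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for crawl jobs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("crawl job failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```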

> Also, why not use something like Bixo so you don't have to prototype a
> threaded crawler?

It is not a web crawler but more of a custom web service client that
issues queries for pages of data.  A second query is assembled based
on the response to the first.  These are biodiversity domain-specific
protocols (DiGIR, TAPIR and BioCASe), which are closer to SOAP-based
request / response.  I'll look at Bixo.
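
The request pattern described above (each query built from the previous
response) can be sketched as a simple paging loop.  This is a generic
illustration only: PagedHarvest, Page, harvest and fakeEndpoint are invented
names, the in-memory endpoint stands in for a real HTTP call, and none of it
reflects the actual DiGIR, TAPIR or BioCASe message formats.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public final class PagedHarvest {
    private PagedHarvest() {}

    /** One page of records plus the offset for the next request (-1 when finished). */
    public static final class Page {
        public final List<String> records;
        public final int nextOffset;

        public Page(List<String> records, int nextOffset) {
            this.records = records;
            this.nextOffset = nextOffset;
        }
    }

    /** Keeps requesting pages, deriving each request from the previous response. */
    public static List<String> harvest(Function<Integer, Page> endpoint) {
        List<String> all = new ArrayList<>();
        int offset = 0;
        while (offset >= 0) {
            Page page = endpoint.apply(offset);
            all.addAll(page.records);
            offset = page.nextOffset; // the next query depends on this response
        }
        return all;
    }

    /** Fake endpoint serving five records, two per page (stand-in for a remote service). */
    public static Page fakeEndpoint(int offset) {
        List<String> data = Arrays.asList("r0", "r1", "r2", "r3", "r4");
        int pageSize = 2;
        int end = Math.min(offset + pageSize, data.size());
        List<String> slice = new ArrayList<>(data.subList(offset, end));
        return new Page(slice, end < data.size() ? end : -1);
    }
}
```

Calling PagedHarvest.harvest(PagedHarvest::fakeEndpoint) walks three pages
and collects all five records.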

Thanks again,

> On Tue, Sep 21, 2010 at 8:24 AM, Tim Robertson <timrobertson100@gmail.com> wrote:
>> Hi all,
>> I am seeing a lot of my clients being kicked out after the 10 minute
>> negotiated timeout is exceeded.
>> My clients are each a JVM (around 100 running on a machine) which are
>> doing web crawling of specific endpoints and handling the response XML
>> - so they do wait around for 3-4 minutes on HTTP timeouts, but
>> certainly not 10 mins.
>> I am just prototyping right now on a 2x quad-core Mac Pro with 12GB
>> memory, and the 100 child processes only get -Xmx64m and I don't see
>> my machine exhausted.
>> Do my clients need to do anything in order to initiate keep-alive
>> heartbeats, or should this be automatic (I thought the tickTime would
>> dictate this)?
>> # my conf is:
>> tickTime=2000
>> dataDir=/Volumes/Data/zookeeper
>> clientPort=2181
>> maxClientCnxns=10000
>> minSessionTimeout=4000
>> maxSessionTimeout=800000
>> Thanks for any pointers to this newbie,
>> Tim
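
On the tickTime question in the quoted message: as far as the ZooKeeper
documentation describes it, tickTime only supplies the default session
timeout bounds (2 x tickTime and 20 x tickTime) when minSessionTimeout and
maxSessionTimeout are not set, and the server clamps whatever timeout the
client requests into that window.  The sketch below is plain Java arithmetic,
not ZooKeeper API code.  SessionTimeouts and its methods are made-up names,
and the requested value of 600000 ms is an assumption inferred from the
10 minute negotiated timeout mentioned above.

```java
public final class SessionTimeouts {
    private SessionTimeouts() {}

    /** Default lower bound when minSessionTimeout is not configured. */
    public static int defaultMinMs(int tickTimeMs) {
        return 2 * tickTimeMs;
    }

    /** Default upper bound when maxSessionTimeout is not configured. */
    public static int defaultMaxMs(int tickTimeMs) {
        return 20 * tickTimeMs;
    }

    /** Clamps the client's requested timeout into the server's [min, max] window. */
    public static int negotiate(int requestedMs, int minMs, int maxMs) {
        return Math.max(minMs, Math.min(requestedMs, maxMs));
    }

    public static void main(String[] args) {
        int requested = 600000; // assumed client request (10 minutes)

        // With the explicit bounds from the quoted config.
        System.out.println(negotiate(requested, 4000, 800000));

        // With only tickTime=2000 and no explicit bounds.
        System.out.println(negotiate(requested, defaultMinMs(2000), defaultMaxMs(2000)));
    }
}
```

With the quoted config the request passes through unchanged, which matches
the 600000 ms in the subject line; with the defaults alone it would be cut
to 40000 ms.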

