Thanks Ted,
> To answer your last question first, no you don't have to do anything
> explicit to keep the ZK connection alive. It is maintained by a dedicated
> thread. You do have to keep your java program responsive and ZK problems
> like this almost always indicate that you have a problem with your program
> checking out for extended periods of time.
>
> My strong guess is that you have something evil happening with your java
> process that is actually causing this delay.
>
> Since you have tiny memory, it probably isn't GC. Since you have a bunch of
> processes, swap and process wakeup delays seem plausible. What is the load
> average on your box?
CPU spikes when responses come in, but mostly it's I/O wait on the
endpoints (timeout of 3 minutes). I suspect HttpClient 4 is dropping
into a retry mechanism, but I have not investigated this yet.
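If it does turn out to be the retry handler, I'll probably just pin the
timeouts and switch automatic retries off - a minimal sketch of what I have
in mind (the class name is a placeholder, and the 3 minute socket timeout
mirrors the endpoint timeout above):

import org.apache.http.client.HttpClient;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.client.DefaultHttpRequestRetryHandler;
import org.apache.http.params.HttpConnectionParams;
import org.apache.http.params.HttpParams;

public class ClientFactory {
  public static HttpClient newClient() {
    DefaultHttpClient client = new DefaultHttpClient();
    HttpParams params = client.getParams();
    // fail the connect quickly, but give the slow endpoints the full 3 minutes to respond
    HttpConnectionParams.setConnectionTimeout(params, 10 * 1000);
    HttpConnectionParams.setSoTimeout(params, 3 * 60 * 1000);
    // no automatic retries, so a dead endpoint can't multiply the wait
    client.setHttpRequestRetryHandler(new DefaultHttpRequestRetryHandler(0, false));
    return client;
  }
}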
> On the topic of your application, why you are using processes instead of
> threads? With threads, you can get your memory overhead down to 10's of
> kilobytes as opposed to 10's of megabytes.
I am just prototyping scaling out to many processes, potentially
across multiple machines. Our live crawler runs in a single JVM, but
some of these crawls take 4-6 weeks, so long-running jobs block
others, and I was looking at alternatives. The live crawler also uses
DOM-based XML parsing, so it hits memory limits; SAX would address
this. We also want to be able to deploy patches to the crawlers
without interrupting those long-running jobs if possible.
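For the DOM to SAX switch, this is the shape I'm picturing - just a minimal
sketch, with the handler and the "record" element name standing in for our
real parsing:

import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class RecordCounter extends DefaultHandler {
  private int records = 0;

  @Override
  public void startElement(String uri, String localName, String qName, Attributes attributes) {
    // handle records as they stream past instead of holding the whole response in a DOM
    if ("record".equals(qName)) {
      records++;
    }
  }

  public static int count(InputStream xml) throws Exception {
    SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    RecordCounter handler = new RecordCounter();
    parser.parse(xml, handler);
    return handler.records;
  }
}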
> Also, why not use something like Bixo so you don't have to prototype a
> threaded crawler?
It is not a web crawler but more of a custom web service client that
issues queries for pages of data, where a second query is assembled
based on the response to the first. These are biodiversity
domain-specific protocols (DiGIR, TAPIR and BioCASe) which are closer
to SOAP-style request/response. I'll take a look at Bixo.
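The request/response handling is protocol specific, but the paging loop
itself is roughly this shape - purely a hypothetical sketch, all the names
(PagedHarvester, PageResponse, EndpointClient, nextStartIndex) are made up:

public class PagedHarvester {

  /** Placeholder for the protocol-specific response envelope. */
  public interface PageResponse {
    boolean hasMoreRecords();
    int nextStartIndex();
  }

  /** Placeholder for the piece that issues one HTTP request and parses it. */
  public interface EndpointClient {
    PageResponse issue(int startIndex) throws Exception;
  }

  public void harvest(EndpointClient client) throws Exception {
    int start = 0;
    boolean more = true;
    while (more) {
      // each request is assembled from the previous response (start index / resumption info)
      PageResponse page = client.issue(start);
      more = page.hasMoreRecords();
      start = page.nextStartIndex();
    }
  }
}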
Thanks again,
Tim
>
> On Tue, Sep 21, 2010 at 8:24 AM, Tim Robertson <timrobertson100@gmail.com> wrote:
>
>> Hi all,
>>
>> I am seeing a lot of my clients being kicked out after the 10 minute
>> negotiated timeout is exceeded.
>> My clients are each a JVM (around 100 running on a machine) which are
>> doing web crawling of specific endpoints and handling the response XML
>> - so they do wait around for 3-4 minutes on HTTP timeouts, but
>> certainly not 10 mins.
>> I am just prototyping right now on a 2x quad-core Mac Pro with 12GB
>> memory, and the 100 child processes only get -Xmx64m and I don't see
>> my machine exhausted.
>>
>> Do my clients need to do anything in order to initiate keep-alive
>> heartbeats or should this be automatic (I thought the tickTime would
>> dictate this)?
>>
>> # my conf is:
>> tickTime=2000
>> dataDir=/Volumes/Data/zookeeper
>> clientPort=2181
>> maxClientCnxns=10000
>> minSessionTimeout=4000
>> maxSessionTimeout=800000
>>
>> Thanks for any pointers to this newbie,
>> Tim
>>
>
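PS: in case it helps with diagnosis, this is roughly how each process creates
its handle - just the standard constructor with a Watcher, relying on the
client library's send thread for the heartbeats. A minimal sketch; the
600000 ms requested timeout is only an example value, which as I understand
it the server would negotiate into the minSessionTimeout/maxSessionTimeout
window above, so it would be granted in full and give the 10 minutes I'm seeing:

import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ZkHandleFactory {
  public static ZooKeeper connect(String hosts) throws Exception {
    final CountDownLatch connected = new CountDownLatch(1);
    // heartbeats are handled by the client's own send thread; the Watcher only reports session state
    ZooKeeper zk = new ZooKeeper(hosts, 600000, new Watcher() {
      public void process(WatchedEvent event) {
        if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
          connected.countDown();
        } else if (event.getState() == Watcher.Event.KeeperState.Expired) {
          // the session is gone for good; a new ZooKeeper instance has to be created
          System.err.println("ZK session expired");
        }
      }
    });
    connected.await();
    return zk;
  }
}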