hadoop-zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: user cousult
Date Fri, 02 Apr 2010 04:49:19 GMT

On 04/01/2010 07:27 PM, li li wrote:
> Dear developer,
>     I am doing research on using ZooKeeper as a load balancer.
> Recently I planned to test the maximum load it can handle, but I have
> some questions I need to ask you.
>     Right now I can handle about 300 clients with one server when I
> set the session timeout to 300000000.

Whoa, that's way too large. Regardless, the server is going to cap the
max timeout at 20*tickTime (so 40 sec in the common case). The larger
you set the timeout value, the longer it will take for your system to
notice failures. Typically you want a timeout between 5 and 30 seconds.
With 5 you are more sensitive to failures, but you are also more
sensitive to transient network glitches. With 30 it takes longer to
notice when a component has died (and therefore failover takes longer),
but you are much less sensitive to network issues. The right setting
depends on your particular situation/architecture.
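
To make the cap concrete, here is a minimal sketch of how a requested session timeout ends up bounded. It assumes the default tickTime of 2000 ms and the default bounds (a floor of 2*tickTime next to the 20*tickTime ceiling mentioned above). The function name is made up for illustration and is not part of any ZooKeeper API.

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms=2000):
    """Clamp a requested session timeout the way the default server bounds would.

    Assumes the defaults: floor = 2 * tickTime, ceiling = 20 * tickTime.
    Hypothetical helper, for illustration only.
    """
    floor_ms = 2 * tick_time_ms
    ceiling_ms = 20 * tick_time_ms
    return max(floor_ms, min(requested_ms, ceiling_ms))


if __name__ == "__main__":
    # A huge request is simply cut down to the ceiling.
    print(negotiated_timeout_ms(300000000))
    # A request inside the bounds is granted unchanged.
    print(negotiated_timeout_ms(30000))
```

Under these assumptions, asking for 300000000 ms buys nothing beyond 40 seconds, while a 30 second request is granted as is.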

Please (re?)read this section on sessions, esp the paragraphs on how the 
timeout works:

It may also be that your client application is swapping or has long GC 
pauses, see this:
esp the section on "frequent client disconnects" and the section on "gc 

>     In your opinion, what session timeout value is most suitable?
>     And in your experiments, how many clients can one server handle?
>     What's more, I set the session timeout to 30000000, which is a
> long time, but when I run about 300 threads as clients I get the
> following error.

I have one team that has 10000 client sessions connected to a single ZK
cluster, each session using a 30 second timeout. It works fine with
this load (group membership, master election, load balancing, sharding
information, etc., all stored in ZK).
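
As a rough back-of-the-envelope check on what that many idle sessions cost, the sketch below estimates the keepalive traffic. It assumes an idle client pings about once every third of its session timeout, which is an approximation of the Java client's behavior and not a guaranteed figure. The function name is hypothetical.

```python
def approx_ping_rate(sessions, session_timeout_s):
    """Estimate pings per second arriving at the ensemble from idle sessions.

    Assumption: each idle client pings roughly every timeout / 3 seconds.
    """
    ping_interval_s = session_timeout_s / 3
    return sessions / ping_interval_s


if __name__ == "__main__":
    # 10000 sessions at a 30 second timeout
    print(approx_ping_rate(10000, 30))
```

Under that assumption, 10000 sessions at 30 seconds come to about 1000 pings per second spread across the cluster.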

Also see this document:
you can see that the server is handling quite a load with minimal
increase in latency (even with 1 CPU). I've pushed this to over 400
clients with 4 million znodes and 20 million watches and it worked fine
(4 CPUs in that case and 8 GB of heap).

If I were you I'd look at swap and GC on both the clients and the
server, to make sure neither is an issue.
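
One cheap way to spot such stalls from inside a process is to run a periodic tick and look for ticks that arrive late. The sketch below is a generic technique, not something ZooKeeper provides. The function names and thresholds are hypothetical, and a late tick only says the process was stalled, not whether swap or GC caused it.

```python
import time


def sample_ticks(count, interval_s):
    """Sleep `interval_s` between samples and record monotonic timestamps."""
    ticks = [time.monotonic()]
    for _ in range(count):
        time.sleep(interval_s)
        ticks.append(time.monotonic())
    return ticks


def find_stalls(timestamps, interval_s, tolerance_s):
    """Return (index, overshoot_s) for every tick that arrived late.

    A tick counts as late when the gap since the previous tick exceeds
    interval_s by more than tolerance_s.
    """
    stalls = []
    for i in range(1, len(timestamps)):
        overshoot = (timestamps[i] - timestamps[i - 1]) - interval_s
        if overshoot > tolerance_s:
            stalls.append((i, overshoot))
    return stalls


if __name__ == "__main__":
    ticks = sample_ticks(count=20, interval_s=0.1)
    print(find_stalls(ticks, interval_s=0.1, tolerance_s=0.5))
```

If the overshoots approach a large fraction of the session timeout, the client can miss its heartbeats even though the network is healthy.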

Good luck,


> ********************************************************************************************************************************************************
>    2010-04-02 10:23:59,437 - WARN  [main-SendThread:ClientCnxn$SendThread@967]
> - Exception closing session 0x0 to sun.nio.ch.SelectionKeyImpl@46604660
> java.net.ConnectException: Connection refused: no further information
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:573)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
>   **************************************************************************************************************************************************
>    I already set maxClientCnxns=0.
>    Thanks for your reply; I am looking forward to your further answer.
>    With best wishes!
