hadoop-zookeeper-user mailing list archives

From "Fournier, Camille F. [Tech]" <Camille.Fourn...@gs.com>
Subject RE: number of clients/watchers
Date Thu, 18 Nov 2010 20:06:06 GMT
We tested up to the ulimit (~16K) of connections against a single server and performance was
ok, but I would definitely try to do some serious load testing before I put a system into
production that I knew was going to have that load from the get-go.
The system degrades VERY ungracefully when you hit the ulimit for the process, so be sure
to have enough ensemble nodes to spread those connections across so that this won't happen.
I think there may be a JIRA open for this issue, but I'm not sure what the status is.
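
A rough way to act on the ulimit warning above is to divide the expected client count by the
number of connections each server can safely hold. The sketch below is only back-of-the-envelope
arithmetic: the function name, the 50% headroom figure, and the 16,384 limit are illustrative
assumptions, not values taken from ZooKeeper or from the tests described in this thread.

```python
import math


def min_servers_for_clients(expected_clients, fd_limit_per_server, headroom=0.5):
    """Estimate how many client-serving servers keep each one below its ulimit.

    headroom is the fraction of the file-descriptor limit we allow client
    connections to use (an assumption); the rest is left for log files,
    quorum connections, and connection spikes.
    """
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    usable = int(fd_limit_per_server * headroom)
    if usable < 1:
        raise ValueError("fd limit too small for the requested headroom")
    return math.ceil(expected_clients / usable)


if __name__ == "__main__":
    # 16,384 is an assumed ulimit close to the ~16K mentioned above.
    for clients in (10_000, 100_000):
        servers = min_servers_for_clients(clients, 16_384)
        print(f"{clients} clients -> at least {servers} servers")
```

The result counts every server that accepts client connections. It says nothing about
throughput or GC behaviour, so it is a lower bound to check before load testing, not a
replacement for it.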

C

-----Original Message-----
From: Patrick Hunt [mailto:phunt@apache.org] 
Sent: Thursday, November 18, 2010 2:57 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: number of clients/watchers

fyi: I haven't heard of anyone running over 10k sessions. I've tried
20k before and had issues, you may want to look at this sooner rather
than later.

* Server gc tuning will be an issue (be sure to use cms/incremental).
* Be sure to disable clients accessing the leader (server configuration param).
* You may need to use the Observers feature to scale out this large.
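
The three points above map onto settings roughly as sketched below. This assumes the
`leaderServes` option and the observer settings (`peerType=observer` and the `:observer`
suffix on `server.N` lines) described in the ZooKeeper administrator and observer guides,
plus the HotSpot CMS flags of that era, which are not available on recent JVMs. The host
names, ports, and helper names are hypothetical.

```python
# JVM flags for "cms/incremental" GC on older HotSpot JVMs (removed in later releases).
GC_FLAGS = ["-XX:+UseConcMarkSweepGC", "-XX:+CMSIncrementalMode"]


def render_zoo_cfg(voters, observers, is_observer=False):
    """Render the scaling-related part of a zoo.cfg (a sketch, not a full config).

    voters and observers are lists of host names (hypothetical here).
    is_observer marks the config for a server that itself runs as an observer.
    """
    # Keep clients off the leader so it can focus on coordinating writes.
    lines = ["leaderServes=no"]
    if is_observer:
        lines.append("peerType=observer")
    server_id = 1
    for host in voters:
        lines.append(f"server.{server_id}={host}:2888:3888")
        server_id += 1
    for host in observers:
        # Observers are tagged in the server list of every ensemble member.
        lines.append(f"server.{server_id}={host}:2888:3888:observer")
        server_id += 1
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_zoo_cfg(["zk1", "zk2", "zk3"], ["obs1", "obs2"], is_observer=True))
    print("JVM flags:", " ".join(GC_FLAGS))
```

Observers take client connections without voting, so adding them spreads sessions across
more servers without enlarging the voting quorum.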

Patrick

On Thu, Nov 18, 2010 at 10:31 AM, Jeremy Hanna
<jeremy.hanna1234@gmail.com> wrote:
>>> Can you clarify what you mean when you say 10-100K watchers? Do you mean 10-100K
>>> clients with 1 active watch, or some lesser number of clients with more watches, or a few
>>> clients doing a lot of watches and other clients doing other things?
>
> Probably 10-100K clients each with 1 or 2 active watches.  The clients will respond
> to watch events and sometimes initiate actions of their own.
>
>> here's a similar test setup I used:
>
> Thanks Patrick - it's really nice to have those numbers and test harness basis.
>
> We're still in architecture mode so some of the details are still in flux, but I think
> this gives us an idea.
>
> Thanks very much.
>
> On Nov 18, 2010, at 11:51 AM, Patrick Hunt wrote:
>
>> Camille, that's a very good question. Largest cluster I've heard about
>> is 10k sessions.
>>
>> Jeremy - largest I've ever tested was a 3 server cluster with ~500
>> sessions. Each session created 10k znodes (100bytes each znode) and
>> set 5 watches on each. So 5 million znodes and 25million watches. I
>> then had the sessions delete the znodes and looked for the
>> notifications. They were processed by the clients quite quickly (order
>> of seconds) iirc. Note: this required some GC tuning on the servers to
>> operate correctly (in particular cms and incremental gc was turned on
>> and sufficient memory was allocated for the heaps).
>>
>> here's a similar test setup I used:
>> http://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview
>> this is the latency tester tool
>> https://github.com/phunt/zk-smoketest
>>
>> Patrick
>>
>> On Thu, Nov 18, 2010 at 9:44 AM, Fournier, Camille F. [Tech]
>> <Camille.Fournier@gs.com> wrote:
>>> Can you clarify what you mean when you say 10-100K watchers? Do you mean 10-100K
>>> clients with 1 active watch, or some lesser number of clients with more watches, or a few
>>> clients doing a lot of watches and other clients doing other things?
>>>
>>> -----Original Message-----
>>> From: Jeremy Hanna [mailto:jeremy.hanna1234@gmail.com]
>>> Sent: Thursday, November 18, 2010 12:15 PM
>>> To: zookeeper-user@hadoop.apache.org
>>> Subject: number of clients/watchers
>>>
>>> I had a question about number of clients against a zookeeper cluster.  I was
>>> looking at having between 10,000 and 100,000 (towards 100,000) watchers within a single datacenter
>>> at a given time.  Assuming that some fraction of that number are active clients and the r/w
>>> ratio is well within the zookeeper norms, is that number within the realm of possibility for
>>> zookeeper?  We're going to do testing and benchmarking and things, but I didn't want to go
>>> down a rabbit hole if this is simply too much for a single zookeeper cluster to handle.
>>> The numbers I've seen in blog posts vary and I saw that the observers feature may be useful
>>> in this kind of setting.
>>>
>>> Maybe I'm underestimating zookeeper or maybe I don't have enough information
>>> to tell.  I'm just trying to see if zookeeper is a good fit for our use case.
>>>
>>> Thanks.
>>>
>
>
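
The totals in the load test quoted above (500 sessions, 10k znodes per session, 5 watches
per znode, 100 bytes per znode) follow from simple multiplication, and the same arithmetic
can be reused for the 10-100K client case. The helper below is a hypothetical illustration.
It counts raw payload bytes only and makes no claim about actual server heap usage, which
the thread does not quantify.

```python
def load_test_totals(sessions, znodes_per_session, watches_per_znode, bytes_per_znode):
    """Return total znodes, watches, and raw payload bytes for a test setup."""
    znodes = sessions * znodes_per_session
    return {
        "znodes": znodes,
        "watches": znodes * watches_per_znode,
        # Payload only: excludes per-znode and per-watch bookkeeping overhead.
        "payload_bytes": znodes * bytes_per_znode,
    }


if __name__ == "__main__":
    totals = load_test_totals(500, 10_000, 5, 100)
    print(totals)  # 5 million znodes and 25 million watches, as quoted above

    # The proposed workload: 10-100K clients with 1 or 2 active watches each.
    print("watch range:", 10_000 * 1, "to", 100_000 * 2)
```

By this count the proposed workload involves far fewer watches than the quoted test, but
20 to 200 times as many sessions, which is where the connection-limit and GC concerns in
this thread apply.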
