zookeeper-user mailing list archives

From Mahadev Konar <maha...@yahoo-inc.com>
Subject Re: performance of watches
Date Mon, 03 Jan 2011 20:54:26 GMT
Sam,
 I think the approach Ted described should have a response time of under a
few seconds, and I think it is probably a more reasonable one for scaling up.

Thanks
mahadev

On 12/16/10 10:17 PM, "Samuel Rash" <sl.rash@fb.com> wrote:

> Can these approaches respond in under a few seconds? If a traffic source
> remains unclaimed for even a short while, we have a problem.
> 
> Also, a host may "shed" traffic manually by releasing a subset of its
> paths.  In this way, having all the other hosts watch only its location does
> protect against the herd when it dies, but how do they know when it
> releases 50/625 traffic buckets?
> 
> I agree we might be able to make a more intelligent design that trades
> latency for watch efficiency, but the idea was that we'd use the simplest
> approach that gave us the lowest latency *if* the throughput of watches
> from zookeeper was sufficient (and it seems like it is from Mahadev's link)
> 
> Thx,
> -sr
> 
> On 12/16/10 9:58 PM, "Ted Dunning" <ted.dunning@gmail.com> wrote:
> 
>> This really sounds like it might be refactored a bit to decrease the number
>> of notifications and reads.
>> 
>> In particular, it sounds like you have two problems.
>> 
>> The first is that the 40 hosts need to claim the various traffic sources:
>> one owner per traffic source, many sources per host.  This is well solved by
>> the standard winner-takes-all file-create idiom.
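>>
>> In code that claim step is just an ephemeral create that either succeeds or
>> hits NodeExistsException -- a rough sketch with the Java client (the /claims
>> path and the host id here are made up for illustration):
>>
>>     import org.apache.zookeeper.CreateMode;
>>     import org.apache.zookeeper.KeeperException;
>>     import org.apache.zookeeper.ZooDefs;
>>     import org.apache.zookeeper.ZooKeeper;
>>
>>     /** Try to claim one traffic source; true if this host won the race. */
>>     boolean tryClaim(ZooKeeper zk, String sourceId, String hostId)
>>             throws KeeperException, InterruptedException {
>>         try {
>>             // Ephemeral: the claim disappears automatically if this
>>             // host's session dies, freeing the source for someone else.
>>             zk.create("/claims/" + sourceId, hostId.getBytes(),
>>                       ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
>>             return true;
>>         } catch (KeeperException.NodeExistsException e) {
>>             return false;   // another host already owns this source
>>         }
>>     }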
>> 
>> The second problem is that other hosts need to know when traffic sources
>> need claiming.
>> 
>> I think you might consider an approach to the second problem which has each
>> host posting a single ephemeral file containing a list of all of the sources
>> it has claimed.  Whenever a host claims a new source, it can update this
>> file.  When a host dies or exits, all the others will wake due to having a
>> watch on the directory containing these ephemerals, will read the remaining
>> host/source lists, and will determine which sources are insufficiently
>> covered.  There will need to be some care taken about race conditions here,
>> but I think they all go the right way.
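>>
>> A sketch of that layout with the Java client (the /hosts path and the
>> comma-separated encoding are only illustrative):
>>
>>     import java.util.List;
>>     import org.apache.zookeeper.CreateMode;
>>     import org.apache.zookeeper.KeeperException;
>>     import org.apache.zookeeper.Watcher;
>>     import org.apache.zookeeper.ZooDefs;
>>     import org.apache.zookeeper.ZooKeeper;
>>
>>     /** Publish (or refresh) this host's list of claimed sources. */
>>     void publishClaims(ZooKeeper zk, String hostId, List<String> sources)
>>             throws KeeperException, InterruptedException {
>>         byte[] data = String.join(",", sources).getBytes();
>>         String path = "/hosts/" + hostId;
>>         try {
>>             // Ephemeral: the whole list vanishes with this host's session.
>>             zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE,
>>                       CreateMode.EPHEMERAL);
>>         } catch (KeeperException.NodeExistsException e) {
>>             zk.setData(path, data, -1);   // already published; just refresh
>>         }
>>     }
>>
>>     /** One watch on /hosts per host, instead of one per path. */
>>     List<String> watchHosts(ZooKeeper zk, Watcher onChange)
>>             throws KeeperException, InterruptedException {
>>         return zk.getChildren("/hosts", onChange);
>>     }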
>> 
>> This means that a host dying will cause 40 notifications followed by 1600
>> reads and at most 40 attempts at file creates.  You might even be able to
>> avoid the 1600 reads by having each of the source directories be watched by
>> several of the 40 hosts.  Then a host dying would cause just a few
>> notifications and a few file creates.
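>>
>> (Back-of-the-envelope: each of the ~40 surviving hosts gets one
>> children-changed notification on the host directory and re-reads the ~40
>> host lists, so roughly 40 notifications and 40 * 40 = 1600 reads, versus
>> 625 * 39 ~= 24,000 watch events in the per-path scheme.)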
>> 
>> A background process on each node could occasionally scan the source lists
>> posted by each host to make sure nothing drops through the cracks.
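>>
>> Something like a periodic reconciliation task would cover it (a sketch; the
>> interval and the reconcileClaims() helper are hypothetical):
>>
>>     import java.util.concurrent.Executors;
>>     import java.util.concurrent.ScheduledExecutorService;
>>     import java.util.concurrent.TimeUnit;
>>
>>     ScheduledExecutorService scanner =
>>             Executors.newSingleThreadScheduledExecutor();
>>     scanner.scheduleAtFixedRate(new Runnable() {
>>         public void run() {
>>             try {
>>                 // Re-read every host's claim list and try to claim
>>                 // anything no live host reports owning.
>>                 reconcileClaims(zk);   // hypothetical helper
>>             } catch (Exception e) {
>>                 // log and retry on the next pass
>>             }
>>         }
>>     }, 60, 60, TimeUnit.SECONDS);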
>> 
>> This seems much more moderate than what you describe.
>> 
>> On Thu, Dec 16, 2010 at 8:23 PM, Samuel Rash <sl.rash@fb.com> wrote:
>> 
>>> Yea--one host going down should trigger 24k watches.  Each host then looks
>>> at its load and determines which paths to acquire (they represent traffic
>>> flow).  This could result in, at worst, 24k create() attempts immediately
>>> after.
>>> 
>>> I'll read the docs--Thanks
>>> 
>>> -sr
>>> 
>>> On 12/16/10 8:06 PM, "Mahadev Konar" <mahadev@yahoo-inc.com> wrote:
>>> 
>>>> Hi Sam,
>>>> Just a clarification: will a host going down fire 625 * 39 watches?  That
>>>> is ~24,000 watches per host that goes down.
>>>> 
>>>> You can take a look at
>>>> http://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview about
>>>> watches and latencies and hw requirements. Please do take a look and if
>>>> it doesn't answer your questions, we should add more documentation.
>>>> 
>>>> Thanks
>>>> Mahadev
>>>> 
>>>> On 12/16/10 7:42 PM, "Samuel Rash" <sl.rash@fb.com> wrote:
>>>> 
>>>> Hello,
>>>> 
>>>> I am looking to run about 40 zookeeper clients with the following watch
>>>> properties:
>>>> 
>>>> 1. Up to 25,000 paths that every host has a watch on (each path has one
>>>> child, an ephemeral node, and the watch fires when that child is removed;
>>>> see the sketch below)
>>>> 2. An individual host "owns" 625 of these paths in this example; one going
>>>> down will fire 625 watches to each of the other 39 hosts
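>>>>
>>>> For reference, the per-path watch looks roughly like this with the Java
>>>> client (the /paths/<id>/owner layout here is made up for illustration):
>>>>
>>>>     import org.apache.zookeeper.WatchedEvent;
>>>>     import org.apache.zookeeper.Watcher;
>>>>     import org.apache.zookeeper.ZooKeeper;
>>>>
>>>>     // exists() both checks for the ephemeral child and (re)arms a
>>>>     // one-shot watch; the watcher must re-register after it fires.
>>>>     void watchOwner(ZooKeeper zk, String path) throws Exception {
>>>>         zk.exists(path + "/owner", new Watcher() {
>>>>             public void process(WatchedEvent event) {
>>>>                 if (event.getType() == Event.EventType.NodeDeleted) {
>>>>                     // owner's session died: decide whether to claim it
>>>>                 }
>>>>             }
>>>>         });
>>>>     }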
>>>> 
>>>> Is there any limit on the rate at which these watches can be sent off?
>>>> What's the right size cluster? (3? 5?)  Does it need to be dedicated hw?
>>>> 
>>>> Thanks,
>>>> Sam
>>>> 
>>>> 
>>> 
>>> 
> 
> 

