From: Ted Dunning
Date: Mon, 3 Jan 2011 13:29:46 -0800
To: user@zookeeper.apache.org
Subject: Re: performance of watches

Btw... this is one of the motives for multi-update.
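(For reference: the multi-update Ted mentions eventually shipped as the multi() call in
ZooKeeper 3.4, which bundles several operations into one atomic request. Below is a minimal
sketch of how a host might claim two buckets and refresh its own claim list in a single round
trip; the /claims and /hosts paths, the bucket names, and the ClaimBatch class are invented
for illustration.)

    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;
    import java.util.List;

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.Op;
    import org.apache.zookeeper.OpResult;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class ClaimBatch {
        // Claim two buckets and refresh this host's claim list in one atomic call.
        // If any operation fails (e.g. another host already created a claim node),
        // none of them are applied, so there are no half-finished claims.
        public static List<OpResult> claimTwoBuckets(ZooKeeper zk, String hostId,
                                                     byte[] newClaimList)
                throws KeeperException, InterruptedException {
            byte[] owner = hostId.getBytes(StandardCharsets.UTF_8);
            return zk.multi(Arrays.asList(
                Op.create("/claims/bucket-17", owner,
                          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL),
                Op.create("/claims/bucket-42", owner,
                          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL),
                Op.setData("/hosts/" + hostId, newClaimList, -1)));
        }
    }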
On Mon, Jan 3, 2011 at 12:54 PM, Mahadev Konar wrote:

> Sam,
>  I think the approach Ted described should have a response time of under
> a few seconds, and I think it is probably a more reasonable one for
> scaling up.
>
> Thanks
> mahadev
>
> On 12/16/10 10:17 PM, "Samuel Rash" wrote:
>
> > Can these approaches respond in under a few seconds? If a traffic source
> > remains unclaimed for even a short while, we have a problem.
> >
> > Also, a host may "shed" traffic manually by releasing a subset of its
> > paths. Having all the other hosts watch only its location does guard
> > against the herd when it dies, but how do they know when it releases
> > 50/625 traffic buckets?
> >
> > I agree we might be able to make a more intelligent design that trades
> > latency for watch efficiency, but the idea was that we'd use the
> > simplest approach that gave us the lowest latency *if* the throughput
> > of watches from zookeeper was sufficient (and it seems like it is from
> > Mahadev's link).
> >
> > Thx,
> > -sr
> >
> > On 12/16/10 9:58 PM, "Ted Dunning" wrote:
> >
> >> This really sounds like it might be refactored a bit to decrease the
> >> number of notifications and reads.
> >>
> >> In particular, it sounds like you have two problems.
> >>
> >> The first is that the 40 hosts need to claim various traffic sources,
> >> one host per traffic source, many sources per host. This is well solved
> >> by the standard winner-takes-all file create idiom.
> >>
> >> The second problem is that other hosts need to know when traffic
> >> sources need claiming.
> >>
> >> I think you might consider an approach to the second problem which has
> >> each host posting a single ephemeral file containing a list of all of
> >> the sources it has claimed. Whenever a host claims a new source, it can
> >> update this file. When a host dies or exits, all the others will wake
> >> due to having a watch on the directory containing these ephemerals,
> >> will read the remaining host/source lists, and determine which sources
> >> are insufficiently covered. There will need to be some care taken about
> >> race conditions on this, but I think they all go the right way.
> >>
> >> This means that a host dying will cause 40 notifications followed by
> >> 1600 reads and at most 40 attempted file creates. You might even be
> >> able to avoid the 1600 reads by having each of the source directories
> >> be watched by several of the 40 hosts. Then a host dying would cause
> >> just a few notifications and a few file creates.
> >>
> >> A background process on each node could occasionally scan the source
> >> lists for each host to make sure nothing drops through the cracks.
> >>
> >> This seems much more moderate than what you describe.
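(A rough sketch of the scheme Ted outlines, using the plain ZooKeeper Java client. The znode
layout, with per-host ephemeral claim lists under /hosts and winner-takes-all ownership nodes
under /claims, plus the comma-separated data format and the CoverageWatcher name, are all
assumptions made for illustration, not code from the thread.)

    import java.nio.charset.StandardCharsets;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class CoverageWatcher implements Watcher {
        private final ZooKeeper zk;
        private final String hostId;
        private final Set<String> allSources;  // the full set of traffic sources

        public CoverageWatcher(ZooKeeper zk, String hostId, Set<String> allSources) {
            this.zk = zk;
            this.hostId = hostId;
            this.allSources = allSources;
        }

        public void process(WatchedEvent event) {
            if (event.getType() == Event.EventType.NodeChildrenChanged) {
                try {
                    checkCoverage();
                } catch (Exception e) {
                    // real code would log and retry with backoff
                }
            }
        }

        // Re-read every host's claim list and try to claim anything uncovered.
        public void checkCoverage() throws KeeperException, InterruptedException {
            Set<String> covered = new HashSet<String>();
            // getChildren with a watcher re-arms the one-shot children watch.
            List<String> hosts = zk.getChildren("/hosts", this);
            for (String host : hosts) {
                try {
                    byte[] data = zk.getData("/hosts/" + host, false, null);
                    for (String source : new String(data, StandardCharsets.UTF_8).split(",")) {
                        covered.add(source);
                    }
                } catch (KeeperException.NoNodeException e) {
                    // that host died between getChildren and getData; skip it
                }
            }
            for (String source : allSources) {
                if (!covered.contains(source)) {
                    try {
                        // Winner takes all: the first create succeeds, everyone else loses.
                        zk.create("/claims/" + source,
                                  hostId.getBytes(StandardCharsets.UTF_8),
                                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                        // a real implementation would now append the source to its
                        // own /hosts/<hostId> claim list
                    } catch (KeeperException.NodeExistsException e) {
                        // another host won the race for this source; nothing to do
                    }
                }
            }
        }
    }

(Each host would call checkCoverage() once at startup to arm the initial children watch;
every later change to /hosts re-runs the scan and re-arms the one-shot watch.)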
> >>
> >> On Thu, Dec 16, 2010 at 8:23 PM, Samuel Rash wrote:
> >>
> >>> Yea--one host going down should trigger 24k watches. Each host then
> >>> looks at its load and determines which paths to acquire (they
> >>> represent traffic flow). This could result in, at worst, 24k create()
> >>> attempts immediately after.
> >>>
> >>> I'll read the docs--thanks
> >>>
> >>> -sr
> >>>
> >>> On 12/16/10 8:06 PM, "Mahadev Konar" wrote:
> >>>
> >>>> Hi Sam,
> >>>> Just a clarification, will a host going down fire 625 * 39 watches?
> >>>> That is ~24,000 watches per host going down.
> >>>>
> >>>> You can take a look at
> >>>> http://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview about
> >>>> watches, latencies, and hardware requirements. Please do take a look,
> >>>> and if it doesn't answer your questions, we should add more
> >>>> documentation.
> >>>>
> >>>> Thanks
> >>>> Mahadev
> >>>>
> >>>> On 12/16/10 7:42 PM, "Samuel Rash" wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> I am looking to run about 40 zookeeper clients with the following
> >>>> watch properties:
> >>>>
> >>>> 1. Up to 25,000 paths that every host has a watch on (each path has
> >>>>    one child, and the watch is on that child, an ephemeral node,
> >>>>    being removed)
> >>>> 2. An individual host "owns" 625 of these paths in this example; one
> >>>>    going down will fire 625 watches to each of the other 39 hosts
> >>>>
> >>>> Is there any limit on the rate at which these watches can be sent
> >>>> off? What's the right size cluster? (3? 5?) Does it need to be
> >>>> dedicated hardware?
> >>>>
> >>>> Thanks,
> >>>> Sam
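(For completeness, the per-path watch described above is the usual exists()-based deletion
watch on an ephemeral node; a minimal sketch with the standard Java client follows. The
/buckets/<n>/owner layout and the OwnerWatcher class are hypothetical. Because ZooKeeper
watches are one-shot, register() has to be called again after every event to keep watching,
which is why a single failed host turns into roughly 625 * 39 notifications and
re-registrations in the setup above.)

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    public class OwnerWatcher implements Watcher {
        private final ZooKeeper zk;
        private final String ownerPath;  // e.g. "/buckets/00017/owner" (hypothetical)

        public OwnerWatcher(ZooKeeper zk, String ownerPath) {
            this.zk = zk;
            this.ownerPath = ownerPath;
        }

        // exists() arms a watch whether or not the node is currently present,
        // so it also covers an owner that is already gone.
        public void register() throws KeeperException, InterruptedException {
            Stat stat = zk.exists(ownerPath, this);
            if (stat == null) {
                ownerGone();
            }
        }

        public void process(WatchedEvent event) {
            try {
                register();  // watches are one-shot: re-check and re-arm on every event
            } catch (Exception e) {
                // real code would log and retry with backoff
            }
        }

        private void ownerGone() {
            // the ephemeral owner vanished (its session died or the bucket was released);
            // this is where a host would decide whether to attempt a create() of its own
        }
    }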