Date: Mon, 6 Sep 2010 23:06:06 -0700
Subject: Re: getting created child on NodeChildrenChanged event
From: Patrick Hunt
To:
 zookeeper-user@hadoop.apache.org, Mahadev Konar
Cc: "todd@spidertracks.co.nz"

It is good to keep things simple, but we have seen some requests related to
the client API for children use cases that seem reasonable. In particular,
handling large numbers of children efficiently (a queue, say) is currently a
problem. We've seen proposals on this before, but no one has followed through
with them. I personally think there's room for improvement; perhaps the
current client API is too simple:

https://issues.apache.org/jira/browse/ZOOKEEPER-423

Patrick

On Fri, Sep 3, 2010 at 11:18 PM, Mahadev Konar wrote:

> Hi Todd,
> We have always tried to lean on the side of keeping things lightweight
> and the API simple. The only way you would be able to do this is with
> sequential creates.
>
> 1. Create nodes like /queueelement-$i, where i is a monotonically
>    increasing number. You could use the sequential flag of ZooKeeper to
>    do this.
>
> 2. When deleting a node, you would remove the node and create a deleted
>    node on
>
>    /deletedqueueelements/queueelement-$i
>
> 2.1 On notification you would go to /deletedqueueelements/ and find out
>     which ones were deleted.
>
> The above only works if you are OK with monotonically unique queue
> elements.
>
> 3. The above method allows folks to see the deltas using
>    deletedqueueelements, which can be garbage collected by some cleanup
>    process (you can be smarter about this as well).
>
> Would something like this work?
>
> Thanks
> mahadev
>
> On 8/31/10 3:55 PM, "Todd Nine" wrote:
>
> > Hi Dave,
> > Thanks for the response. I understand your point about missed events
> > during a watch reset period. I may be off, but here is the functionality
> > I was thinking of. I'm not sure if the ZK internal versioning process
> > could possibly support something like this.
> >
> > 1. A watch is placed on children.
> > 2. The event is fired to the client. The client receives the Stat
> >    object as part of the event for the current state of the node when
> >    the event was created. We'll call this Stat A, with version 1.
> > 3. The client performs processing. Meanwhile the node has several
> >    children changed. Versions are incremented to version 2 and
> >    version 3.
> > 4. Client resets the watch.
> > 5. A node is added.
> > 6. The event is fired to the client. Client receives Stat B with
> >    version 4.
> > 7. Client calls deltaChildren(Stat A, Stat B).
> > 8. ZooKeeper returns the nodes added between the two stats, and also
> >    the nodes deleted between them.
> >
> > This would handle the missed event problem since the client would have
> > the 2 states it needs to compare. It also allows clients dealing with
> > large data sets to only deal with the delta over time (like a git
> > replay). Our number of queues could get quite large, and I'm concerned
> > that keeping my previous event's children in a set to perform the delta
> > may become quite memory and processor intensive. Would a feature like
> > this be possible without overcomplicating the ZooKeeper core?
> >
> > Thanks,
> > Todd
> >
> > On Tue, 2010-08-31 at 09:23 -0400, Dave Wright wrote:
> >
> >> Hi Todd -
> >> The general explanation for why ZooKeeper doesn't pass the event
> >> information w/ the event notification is that an event notification is
> >> only triggered once, and thus may indicate multiple events. For
> >> example, if you do a GetChildren and set a watch, then multiple
> >> children are added at about the same time, the first one triggers a
> >> notification, but the second (or later) ones do not. When you do
> >> another GetChildren() request to get the list and reset the watch,
> >> you'll see all the changed nodes; however, if you had just been told
> >> about the first change in the notification you would have missed the
> >> others.
> >> To do what you are wanting, you would really need "persistent" watches
> >> that send notifications every time a change occurs and don't need to
> >> be reset, so you can't miss events. That isn't the design that was
> >> chosen for ZooKeeper and I don't think it's likely to be implemented.
> >>
> >> -Dave Wright
> >>
> >> On Tue, Aug 31, 2010 at 3:49 AM, Todd Nine wrote:
> >>
> >>> Hi all,
> >>> I'm writing a distributed queue monitoring class for our leader node
> >>> in the cluster. We're queueing messages per input hardware device;
> >>> each queue is then assigned to the node with the least load in our
> >>> cluster. To do this, I maintain two persistent znodes with the
> >>> following format.
> >>>
> >>> data queue
> >>>
> >>> /dataqueue/devices//
> >>>
> >>> processing follower
> >>>
> >>> /dataqueue/nodes//
> >>>
> >>> The queue monitor watches for changes on the path of
> >>> /dataqueue/devices. When the first packet from a unit is received,
> >>> the queue writer will create the queue with the unit id. This
> >>> triggers the watch event on the monitoring class, which in turn
> >>> creates the znode for the path with the least loaded node. This path
> >>> is watched for child node creation and the node creates a queue
> >>> consumer to consume messages from the new queue.
> >>>
> >>> Our list of queues can become quite large, and I would prefer not to
> >>> maintain a list of queues I have assigned and then perform a delta
> >>> when the event fires to determine which queues are new and caused the
> >>> watch event. I can't really use sequenced nodes and keep track of my
> >>> last read position, because I don't want to iterate over the list of
> >>> queues to determine which sequenced node belongs to the current unit
> >>> id (it would require full iteration, which really doesn't save me any
> >>> reads). Is it possible to create a watch to return the path and Stat
> >>> of the child node that caused the event to fire?
> >>>
> >>> Thanks,
> >>> Todd
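
For readers of the archive, the client-side approach discussed in this thread (re-read the full child list when the notification fires, then diff it against the previous list) can be sketched as below. This is a minimal illustration in Python that deliberately uses no real ZooKeeper client: `fetch_children` is a hypothetical stand-in for a getChildren call that also re-arms the watch, and `ChildrenTracker` is an illustrative name, not part of any ZooKeeper API.

```python
from typing import Callable, Iterable, Set, Tuple


class ChildrenTracker:
    """Remembers the last seen child names and reports deltas between reads."""

    def __init__(self, fetch_children: Callable[[], Iterable[str]]) -> None:
        # fetch_children is a hypothetical stand-in for a client call such as
        # getChildren(path, watch=True): it returns the current child names
        # and is assumed to re-arm the one-shot watch as a side effect.
        self._fetch_children = fetch_children
        self._known: Set[str] = set()

    def on_children_changed(self) -> Tuple[Set[str], Set[str]]:
        """Call when a NodeChildrenChanged notification arrives.

        Returns (added, removed) relative to the previous read. Because the
        full list is re-read, changes made while no watch was set are still
        picked up, even though only one notification fired.
        """
        current = set(self._fetch_children())
        added = current - self._known
        removed = self._known - current
        self._known = current
        return added, removed


# Simulated server state, standing in for the children of /dataqueue/devices.
server_children = ["unit-1"]
tracker = ChildrenTracker(lambda: list(server_children))

first_added, first_removed = tracker.on_children_changed()

# Two creates and one delete happen before the client reads again. Only one
# notification would fire, but the diff still reports every change.
server_children.extend(["unit-2", "unit-3"])
server_children.remove("unit-1")
second_added, second_removed = tracker.on_children_changed()

print(sorted(second_added), sorted(second_removed))
```

The cost of this sketch is one in-memory set of child names per watched path, which is the overhead Todd is concerned about; the deleted-node scheme Mahadev describes trades that memory for extra znodes and a cleanup process.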