zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Using ZK for real-time group membership notification
Date Sat, 19 Mar 2011 04:41:18 GMT
Hi,

Thanks Ben.  Let me describe the context a bit more - once you know what I'm 
trying to do you may have suggestions for how to solve the problem with or 
without ZK.

I have a continuous "stream" of documents that I need to process.  The stream is 
pretty fast (don't know the exact number, but it's many docs a second).  Docs 
live only in-memory in that stream and cannot be saved to disk at any point up 
the stream.

My app listens to this stream.  Because of the high document ingestion rate I 
need N instances of my app to listen to this stream.  So all N apps listen and 
they all "get" the same documents, but only 1 app actually processes each 
document -- "if (docID mod N == appID) then process doc" -- the usual consistent 
hashing stuff.  I'd like to be able to add and remove apps dynamically and have 
the remaining apps realize that "N" has changed.  Similarly, when some app 
instance dies and thus "N" changes, I'd like all the remaining instances to know 
about it.

If my apps don't know the correct "N" then 1/Nth of docs will go unprocessed (if 
the app died or was removed) until the remaining apps adjust their local value 
of "N".

> to deal with this applications can use views, which allow  clients to
> reconcile differences. for example, if two processes communicate  and

Hm, this requires apps to communicate with each other.  If each app was aware of 
other apps, then I could get the membership count directly using that mechanism, 
although I still wouldn't be able to immediately detect when some app died, at 
least I'm not sure how I could do that.

> one has a different list of members than the other then they can  both
> consult zookeeper to reconcile or use the membership list with  the
> highest zxid. the other option is to count on eventually  everyone
> converging.

Right, if I could live with eventually having the right "N", then I could use ZK 
as described on 
http://eng.wealthfront.com/2010/01/actually-implementing-group-management.html

But say I'm OK with "eventually everyone converging".  Can I use ZK then?  And 
if so, how "eventually" is this "eventually"?  That is, if an app dies, how 
quickly can ZK notify all znode watchers that znode change? A few milliseconds 
or more?

In general, how does one deal with situations like the one I described above, 
where each app is responsible for 1/Nth of work and where N can uncontrollably 
and unexpectedly change?

Thanks!
Otis





----- Original Message ----
> From: Benjamin Reed <breed@apache.org>
> To: user@zookeeper.apache.org
> Sent: Fri, March 18, 2011 5:59:43 PM
> Subject: Re: Using ZK for real-time group membership notification
> 
> in a distributed setting such an answer is impossible. especially
> given the  theory of relativity and the speed of light. a machine may
> fail right after  sending a heart beat or another may come online right
> after sending a report.  even if zookeeper could provide this you would
> still have thread scheduling  issues on a local machine that means that
> you are operating on old  information.
> 
> to deal with this applications can use views, which allow  clients to
> reconcile differences. for example, if two processes communicate  and
> one has a different list of members than the other then they can  both
> consult zookeeper to reconcile or use the membership list with  the
> highest zxid. the other option is to count on eventually  everyone
> converging.
> 
> i would not develop a distributed system with the  assumption that "all
> group members know *the exact number of  members at  all times*".
> 
> ben
> 
> On Fri, Mar 18, 2011 at 2:02 PM, Otis  Gospodnetic
> <otis_gospodnetic@yahoo.com>  wrote:
> > Hi,
> >
> > Short version:
> > How can ZK be used to  make sure that all group members know *the exact 
>number of
> > members at  all times*?
> >
> > I have an app that can be run on 1 or more servers.   New instances of the 
>app
> > come and go, may die, etc. -- the number of  the app instances is completely
> > dynamic.  At any one time, as these apps  come and go, each live instance of 
>the
> > app needs to know how many  instances are there total.  If a new instance of 
>the
> > app is started, all  instances need to know the new total number of 
>instances.
> > If an app is  stopped or if it dies, the remaining apps need to know the new
> > number of  app instances.
> >
> > Also, and this is critical, they need to know  about these additions/removals 
>of
> > apps right away and they all need to  find out them at the same time. 
>Basically,
> > all members of some group  need to know *the exact number of members at all
> > times*.
> >
> >  This sounds almost like we need to watch a "parent group znode" and monitor  
>the
> > number of its ephemeral children, which represent each app instance  that is
> > watching the "parent groups znode".  Is that right?  If so, then  all I'd 
>need to
> > know is the answer to "How many watchers are watching  this znode?" of "How 
>many
> > kids does this znode have?". And I'd need ZK  to notify all watchers whenever 
>the
> > answer to this question changes.   Ideally it would send/push the answer 
(the
> > number of watchers) to all  watchers, but if not, I assume any watcher that 
>is
> > notified about the  change would go poll ZK to get the number of ephemeral 
>kids.
> >
> > I  think the above is essentially what's described on
> > 
>http://eng.wealthfront.com/2010/01/actually-implementing-group-management.html ,
> > but doesn't answer the part that's critical for me (the very first Q  up 
>above).
> >
> > Thanks,
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr -  Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/
> >
> >
> 

Mime
View raw message