Hi,
Thanks Ben. Let me describe the context a bit more - once you know what I'm
trying to do you may have suggestions for how to solve the problem with or
without ZK.
I have a continuous "stream" of documents that I need to process. The stream is
pretty fast (don't know the exact number, but it's many docs a second). Docs
live only in-memory in that stream and cannot be saved to disk at any point up
the stream.
My app listens to this stream. Because of the high document ingestion rate I
need N instances of my app to listen to this stream. So all N apps listen and
they all "get" the same documents, but only 1 app actually processes each
document -- "if (docID mod N == appID) then process doc" -- the usual consistent
hashing stuff. I'd like to be able to add and remove apps dynamically and have
the remaining apps realize that "N" has changed. Similarly, when some app
instance dies and thus "N" changes, I'd like all the remaining instances to know
about it.
If my apps don't know the correct "N" then 1/Nth of docs will go unprocessed (if
the app died or was removed) until the remaining apps adjust their local value
of "N".
> to deal with this applications can use views, which allow clients to
> reconcile differences. for example, if two processes communicate and
Hm, this requires apps to communicate with each other. If each app was aware of
other apps, then I could get the membership count directly using that mechanism,
although I still wouldn't be able to immediately detect when some app died, at
least I'm not sure how I could do that.
> one has a different list of members than the other then they can both
> consult zookeeper to reconcile or use the membership list with the
> highest zxid. the other option is to count on eventually everyone
> converging.
Right, if I could live with eventually having the right "N", then I could use ZK
as described on
http://eng.wealthfront.com/2010/01/actually-implementing-group-management.html
But say I'm OK with "eventually everyone converging". Can I use ZK then? And
if so, how "eventually" is this "eventually"? That is, if an app dies, how
quickly can ZK notify all znode watchers that znode change? A few milliseconds
or more?
In general, how does one deal with situations like the one I described above,
where each app is responsible for 1/Nth of work and where N can uncontrollably
and unexpectedly change?
Thanks!
Otis
----- Original Message ----
> From: Benjamin Reed <breed@apache.org>
> To: user@zookeeper.apache.org
> Sent: Fri, March 18, 2011 5:59:43 PM
> Subject: Re: Using ZK for real-time group membership notification
>
> in a distributed setting such an answer is impossible. especially
> given the theory of relativity and the speed of light. a machine may
> fail right after sending a heart beat or another may come online right
> after sending a report. even if zookeeper could provide this you would
> still have thread scheduling issues on a local machine that means that
> you are operating on old information.
>
> to deal with this applications can use views, which allow clients to
> reconcile differences. for example, if two processes communicate and
> one has a different list of members than the other then they can both
> consult zookeeper to reconcile or use the membership list with the
> highest zxid. the other option is to count on eventually everyone
> converging.
>
> i would not develop a distributed system with the assumption that "all
> group members know *the exact number of members at all times*".
>
> ben
>
> On Fri, Mar 18, 2011 at 2:02 PM, Otis Gospodnetic
> <otis_gospodnetic@yahoo.com> wrote:
> > Hi,
> >
> > Short version:
> > How can ZK be used to make sure that all group members know *the exact
>number of
> > members at all times*?
> >
> > I have an app that can be run on 1 or more servers. New instances of the
>app
> > come and go, may die, etc. -- the number of the app instances is completely
> > dynamic. At any one time, as these apps come and go, each live instance of
>the
> > app needs to know how many instances are there total. If a new instance of
>the
> > app is started, all instances need to know the new total number of
>instances.
> > If an app is stopped or if it dies, the remaining apps need to know the new
> > number of app instances.
> >
> > Also, and this is critical, they need to know about these additions/removals
>of
> > apps right away and they all need to find out them at the same time.
>Basically,
> > all members of some group need to know *the exact number of members at all
> > times*.
> >
> > This sounds almost like we need to watch a "parent group znode" and monitor
>the
> > number of its ephemeral children, which represent each app instance that is
> > watching the "parent groups znode". Is that right? If so, then all I'd
>need to
> > know is the answer to "How many watchers are watching this znode?" of "How
>many
> > kids does this znode have?". And I'd need ZK to notify all watchers whenever
>the
> > answer to this question changes. Ideally it would send/push the answer
(the
> > number of watchers) to all watchers, but if not, I assume any watcher that
>is
> > notified about the change would go poll ZK to get the number of ephemeral
>kids.
> >
> > I think the above is essentially what's described on
> >
>http://eng.wealthfront.com/2010/01/actually-implementing-group-management.html ,
> > but doesn't answer the part that's critical for me (the very first Q up
>above).
> >
> > Thanks,
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/
> >
> >
>
|