Hi Otis,
Would it be possible to change your app to use a producer-consumer queue? That way,
no document will go unprocessed when an instance goes down.
http://zookeeper.apache.org/doc/r3.3.2/zookeeperTutorial.html#sc_producerConsumerQueues
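
A minimal in-process sketch of why the queue model helps. It uses Python's
standard queue and threading modules as a stand-in for the ZooKeeper recipe
linked above, so it is not ZooKeeper client code; run_workers, dead_worker and
the document IDs are made-up names for illustration. Each worker pulls the next
document whenever it is free, so nobody needs to know N:

```python
import queue
import threading

def run_workers(doc_ids, num_workers, dead_worker=None):
    """Drain doc_ids with num_workers consumers; dead_worker never starts."""
    work = queue.Queue()
    for doc_id in doc_ids:
        work.put(doc_id)

    processed = {}            # doc_id -> id of the worker that handled it
    lock = threading.Lock()

    def consume(worker_id):
        while True:
            try:
                doc_id = work.get_nowait()
            except queue.Empty:
                return        # queue drained, worker exits
            with lock:
                processed[doc_id] = worker_id

    threads = [
        threading.Thread(target=consume, args=(w,))
        for w in range(num_workers)
        if w != dead_worker   # simulate an instance that is down
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed

if __name__ == "__main__":
    done = run_workers(range(100), num_workers=4, dead_worker=2)
    print(len(done), "of 100 documents processed")
```

A worker that dies while it is holding a document would still need some kind of
retry or acknowledgement, which this sketch does not cover.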
--Michi
On Mar 18, 2011, at 9:41 PM, Otis Gospodnetic wrote:
> Hi,
>
> Thanks Ben. Let me describe the context a bit more - once you know what I'm
> trying to do you may have suggestions for how to solve the problem with or
> without ZK.
>
> I have a continuous "stream" of documents that I need to process. The stream is
> pretty fast (don't know the exact number, but it's many docs a second). Docs
> live only in-memory in that stream and cannot be saved to disk at any point up
> the stream.
>
> My app listens to this stream. Because of the high document ingestion rate I
> need N instances of my app to listen to this stream. So all N apps listen and
> they all "get" the same documents, but only 1 app actually processes each
> document -- "if (docID mod N == appID) then process doc" -- the usual consistent
> hashing stuff. I'd like to be able to add and remove apps dynamically and have
> the remaining apps realize that "N" has changed. Similarly, when some app
> instance dies and thus "N" changes, I'd like all the remaining instances to know
> about it.
>
> If my apps don't know the correct "N" then 1/Nth of docs will go unprocessed (if
> the app died or was removed) until the remaining apps adjust their local value
> of "N".
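
To make the gap concrete, here is a small sketch of the "docID mod N == appID"
rule with a stale N. The function names are hypothetical, and the second
function rests on an assumption of mine that is not in your description: once
an app is gone the surviving appIDs no longer run 0..N-1, so each survivor uses
its rank in the sorted list of live IDs instead of its raw appID.

```python
def unprocessed_docs(doc_ids, n, dead_app_id):
    """Docs nobody handles while the survivors still use the old n."""
    live_apps = [a for a in range(n) if a != dead_app_id]
    return [d for d in doc_ids
            if not any(d % n == app_id for app_id in live_apps)]

def owner(doc_id, live_app_ids):
    """Owner of doc_id once survivors agree on the new membership list.

    Assumption: ownership goes by rank in the sorted list of live IDs,
    because the raw IDs may have holes after an app dies.
    """
    ranked = sorted(live_app_ids)
    return ranked[doc_id % len(ranked)]

if __name__ == "__main__":
    # App 2 of 4 has died, but the others still compute docID mod 4.
    print(unprocessed_docs(range(12), n=4, dead_app_id=2))  # [2, 6, 10]
    # After everyone has seen the new membership [0, 1, 3]:
    print(owner(10, [0, 1, 3]))  # 1
```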
>
>> to deal with this applications can use views, which allow clients to
>> reconcile differences. for example, if two processes communicate and
>
> Hm, this requires apps to communicate with each other. If each app was aware of
> other apps, then I could get the membership count directly using that mechanism,
> although I still wouldn't be able to immediately detect when some app died, at
> least I'm not sure how I could do that.
>
>> one has a different list of members than the other then they can both
>> consult zookeeper to reconcile or use the membership list with the
>> highest zxid. the other option is to count on eventually everyone
>> converging.
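
A rough sketch of the "highest zxid wins" rule Ben describes. It assumes each
process keeps the zxid it saw together with the member list it read; the View
type and reconcile function are hypothetical names, and how the zxid is obtained
from ZooKeeper is left out.

```python
from typing import FrozenSet, NamedTuple

class View(NamedTuple):
    zxid: int                  # zxid observed when the list was read
    members: FrozenSet[str]    # member names as seen at that zxid

def reconcile(mine: View, theirs: View) -> View:
    """Return the more recent of two membership views."""
    return theirs if theirs.zxid > mine.zxid else mine

if __name__ == "__main__":
    a = View(10, frozenset({"app-0", "app-1", "app-2"}))
    b = View(12, frozenset({"app-0", "app-2"}))
    print(sorted(reconcile(a, b).members))  # ['app-0', 'app-2']
```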
>
> Right, if I could live with eventually having the right "N", then I could use ZK
> as described on
> http://eng.wealthfront.com/2010/01/actually-implementing-group-management.html
>
> But say I'm OK with "eventually everyone converging". Can I use ZK then? And
> if so, how "eventually" is this "eventually"? That is, if an app dies, how
> quickly can ZK notify all znode watchers of that znode change? A few
> milliseconds or more?
>
> In general, how does one deal with situations like the one I described above,
> where each app is responsible for 1/Nth of work and where N can uncontrollably
> and unexpectedly change?
>
> Thanks!
> Otis
>
> ----- Original Message ----
>> From: Benjamin Reed <breed@apache.org>
>> To: user@zookeeper.apache.org
>> Sent: Fri, March 18, 2011 5:59:43 PM
>> Subject: Re: Using ZK for real-time group membership notification
>>
>> in a distributed setting such an answer is impossible. especially
>> given the theory of relativity and the speed of light. a machine may
>> fail right after sending a heart beat or another may come online right
>> after sending a report. even if zookeeper could provide this you would
>> still have thread scheduling issues on a local machine that mean that
>> you are operating on old information.
>>
>> to deal with this applications can use views, which allow clients to
>> reconcile differences. for example, if two processes communicate and
>> one has a different list of members than the other then they can both
>> consult zookeeper to reconcile or use the membership list with the
>> highest zxid. the other option is to count on eventually everyone
>> converging.
>>
>> i would not develop a distributed system with the assumption that "all
>> group members know *the exact number of members at all times*".
>>
>> ben
>>
>> On Fri, Mar 18, 2011 at 2:02 PM, Otis Gospodnetic
>> <otis_gospodnetic@yahoo.com> wrote:
>>> Hi,
>>>
>>> Short version:
>>> How can ZK be used to make sure that all group members know *the exact
>>> number of members at all times*?
>>>
>>> I have an app that can be run on 1 or more servers. New instances of the
>>> app come and go, may die, etc. -- the number of the app instances is
>>> completely dynamic. At any one time, as these apps come and go, each live
>>> instance of the app needs to know how many instances there are in total.
>>> If a new instance of the app is started, all instances need to know the new
>>> total number of instances. If an app is stopped or if it dies, the
>>> remaining apps need to know the new number of app instances.
>>>
>>> Also, and this is critical, they need to know about these
>>> additions/removals of apps right away and they all need to find out about
>>> them at the same time. Basically, all members of some group need to know
>>> *the exact number of members at all times*.
>>>
>>> This sounds almost like we need to watch a "parent group znode" and monitor
>>> the number of its ephemeral children, which represent each app instance
>>> that is watching the "parent group znode". Is that right? If so, then all
>>> I'd need to know is the answer to "How many watchers are watching this
>>> znode?" or "How many kids does this znode have?". And I'd need ZK to notify
>>> all watchers whenever the answer to this question changes. Ideally it would
>>> send/push the answer (the number of watchers) to all watchers, but if not,
>>> I assume any watcher that is notified about the change would go poll ZK to
>>> get the number of ephemeral kids.
>>>
>>> I think the above is essentially what's described on
>>> http://eng.wealthfront.com/2010/01/actually-implementing-group-management.html ,
>>> but it doesn't answer the part that's critical for me (the very first Q up
>>> above).
>>>
>>> Thanks,
>>> Otis
>>> ----
>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>> Lucene ecosystem search :: http://search-lucene.com/
>>>
>>>
>>