zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Design question
Date Wed, 09 Nov 2011 19:28:18 GMT
Been there!

Having both in memory is the ideal-world scenario.  Some days we have to
live in the real world.  The chapter 16 example may still help.

That example is available, btw, at https://github.com/tdunning/Chapter-16

On Wed, Nov 9, 2011 at 10:29 AM, Mark <static.void.dev@gmail.com> wrote:

> Memory constraints of those machines prevent us from being able to load
> two models at the same time.
>
> On 11/8/11 10:10 PM, Ted Dunning wrote:
>
>> Yes.  This definitely could be done with ZK.
>>
>> See chapter 16 of Mahout in Action for an example of how to manage this
>> for a farm of classifiers which have very similar issues (although loading
>> a new model is much faster).
>>
>> One trick that might work is to load the new model before dropping the old
>> one.  You might be able to do a very fast handover that way.
>>
>> On Tue, Nov 8, 2011 at 12:18 PM, Mark <static.void.dev@gmail.com> wrote:
>>
>>> I have a general design question regarding ZooKeeper.
>>>
>>> Our use case: We currently have 3 RESTful recommendation servers that
>>> simply wrap a Mahout GenericBooleanPrefItemBasedRecommender. We started
>>> off using a JDBCDataModel, but for performance reasons we had to switch
>>> to a FileDataModel so everything would be kept in memory. Our
>>> recommendation service is now blazing fast, but the startup/reload time
>>> for each of these services is in the minutes. If we try to update all
>>> services at once, all recommendation requests come to a halt. As a
>>> result, whenever we push a new model we have to do it in stages, i.e.
>>> disable server1, update, wait, re-enable, disable server2... We've
>>> "automated" this using cron by simply updating one server, waiting 10
>>> minutes, then updating the next, and so on. We are trying to figure out
>>> if this coordination would be better managed via ZooKeeper.
>>>
>>> I've read a bit about ZooKeeper, and it seems like it would be easy to
>>> set a watch on a node that triggers when the model has changed, thus
>>> triggering a refresh of our recommender. Where I get lost is how I would
>>> coordinate this so that only one server at a time goes down. When it
>>> comes back up, the next server should be updated. Can someone please
>>> explain how this could be accomplished? Thanks
>>>
>>>
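
A minimal sketch of the "only one server at a time goes down" coordination
asked about above. It is a single-process simulation, not ZooKeeper code:
threading.Lock stands in for a ZooKeeper-based distributed lock, and the
names Server, DownTracker, rolling_update and demo are hypothetical. In a
real deployment the lock would more likely follow the ZooKeeper lock recipe
(ephemeral sequential znodes), so that a server that dies mid-reload gives
up its turn when its session expires. Each server reacts to the "model
changed" notification by taking the lock, reloading, and releasing it.

```python
import threading
import time


class DownTracker:
    """Counts how many servers are out of rotation at the same moment."""

    def __init__(self):
        self._guard = threading.Lock()
        self.down = 0
        self.max_down = 0

    def enter(self):
        with self._guard:
            self.down += 1
            self.max_down = max(self.max_down, self.down)

    def leave(self):
        with self._guard:
            self.down -= 1


class Server:
    """Hypothetical recommendation server wrapping an in-memory model."""

    def __init__(self, name, update_lock, tracker):
        self.name = name
        self.update_lock = update_lock  # stand-in for a distributed lock
        self.tracker = tracker
        self.model_version = 0

    def on_model_changed(self, new_version):
        # Would be invoked when a watch on the model node fires (assumption).
        with self.update_lock:          # only one holder at a time
            self.tracker.enter()        # server leaves rotation
            try:
                time.sleep(0.01)        # placeholder for the slow model reload
                self.model_version = new_version
            finally:
                self.tracker.leave()    # back in rotation before unlocking


def rolling_update(servers, new_version):
    # Notify every server at once; the lock serializes the actual reloads.
    threads = [
        threading.Thread(target=s.on_model_changed, args=(new_version,))
        for s in servers
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def demo():
    lock = threading.Lock()
    tracker = DownTracker()
    servers = [Server(f"server{i}", lock, tracker) for i in (1, 2, 3)]
    rolling_update(servers, new_version=1)
    return servers, tracker


if __name__ == "__main__":
    servers, tracker = demo()
    print("max servers down at once:", tracker.max_down)
```

All servers are notified together, yet the reloads run strictly one after
another, which replaces the fixed 10-minute cron spacing with "next server
starts as soon as the previous one is back".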
