incubator-couchdb-user mailing list archives

From Andrew Melo <andrew.m...@gmail.com>
Subject Re: About denormalization and keep consistent
Date Fri, 09 Apr 2010 19:24:24 GMT
On Fri, Apr 9, 2010 at 2:22 PM, faust 1111 <faust451@gmail.com> wrote:
>>Author changes their profile document.
>
>>Something listening to the _changes feed notices this.
>
>>It starts a process querying for docs that have the old author profile, and fixing
>>those docs. Eventually all the docs have been updated. Then its work is done.
>
> But how does
>>Something listening to the _changes feed notices this.
> know the old author name?

You have to tell it somehow. You can either rig up some way to do it
through Couch itself, or do something out-of-band to notify your daemon
process of what work it needs to do.
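
One possible way to sidestep needing the old name at all, offered only as a
sketch and not something anyone in this thread has tested: since each Content
stores both the author's id and the author's name, the daemon could look up
contents by author id and fix any whose stored name differs from the current
one. The field names (author_id, author_name) and the helper names below are
assumptions made for illustration, and plain hashes stand in for real docs.

```ruby
# Hypothetical document shapes: content docs carry a denormalized copy of
# the author's name next to the author's id.

# Select the content docs that belong to this author but still carry a
# name that differs from the author's current name.
def stale_contents(author, contents)
  contents.select do |doc|
    doc["author_id"] == author["_id"] && doc["author_name"] != author["name"]
  end
end

# Build corrected copies of the stale docs; the inputs are left untouched.
def repair(author, contents)
  stale_contents(author, contents).map do |doc|
    doc.merge("author_name" => author["name"])
  end
end

author = { "_id" => "author-1", "name" => "Joseph" }
contents = [
  { "_id" => "c1", "author_id" => "author-1", "author_name" => "Joe" },
  { "_id" => "c2", "author_id" => "author-1", "author_name" => "Joseph" },
  { "_id" => "c3", "author_id" => "author-2", "author_name" => "Joe" }
]

fixed = repair(author, contents)
```

In a live setup the contents would presumably come from a view keyed by
author id and the repaired docs would be written back to the database; both
steps are left out here. Because the check compares against the current name,
running it again later also catches writes that came in with a stale name.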

best,
Andrew


>
>
> 2010/4/9 J Chris Anderson <jchris@gmail.com>:
>>
>> On Apr 9, 2010, at 11:44 AM, faust 1111 wrote:
>>
>>> Thanks Chris.
>>>
>>>> Your idea to do it in the app server, as the author changes their master
>>>> record, is troubling because it can lead to race conditions. The changes method, where a name-update
>>>> is an asynchronous process, is more robust, because you can know for sure that *eventually*
>>>> the author's name will be changed everywhere it appears.
>>>>
>>>
>>> You told about more robust method, you mean run backend process listen
>>> _changes feed for Authors,
>>> and when changes come try update all contents related to author?
>>> If i get you right, i don't understand you
>>>> because you can know for sure that *eventually* the author's name will be
>>>> changed everywhere it appears.
>>> What you mean?
>>>
>>
>> the basic pattern is:
>>
>> Author changes their profile document.
>>
>> Something listening to the _changes feed notices this.
>>
>> It starts a process querying for docs that have the old author profile, and fixing
>> those docs. Eventually all the docs have been updated. Then its work is done.
>>
>> The only complication is, maybe a write with the old (stale) author profile comes
>> in much later (for some reason) so maybe you want to recheck any of those background processes
>> that you spawned, again an hour after they complete. or something. with a small site you don't
>> have to worry about this but if you are operating at web scale it will start to matter.
>>
>>>
>>>> My method is to have a view of docs by author, and then query that view for
>>>> the old author's name, updating any docs that appear. This way if new writes come in with
>>>> the old name (due to there being out of date replicas of the master record lingering, for
>>>> instance) they will be eventually updated as well. You could have a time-to-live of something
>>>> like 5 minutes (or longer if your system is giant) for the process which is running the query
>>>> for docs-that-say-Joe-but-should-say-Joseph and updating them.
>>>>
>>>
>>> Probably i don't get you:
>>> I track _changes feed, when change come you suggest query that view
>>> for the old author's name, but how i know old name, author doc all
>>> ready with newest name.
>>>
>>> Sorry for getting your time, to answer for stupid questions.
>>>
>>>
>>> 2010/4/9 J Chris Anderson <jchris@gmail.com>:
>>>>
>>>> On Apr 8, 2010, at 11:55 PM, faust 1111 wrote:
>>>>
>>>>> Yes i understand that listen _changes is better to get round race conditions.
>>>>>
>>>>> Cannot get your suggesting about
>>>>> how i can track that all contents related to author was updated not 5
>>>>> of 50 but all.
>>>>>
>>>>
>>>> I think your question is valid. The answer is also simple. There is no way
>>>> to transactionally ensure that the author's name is updated everywhere it appears.
>>>>
>>>> Your idea to do it in the app server, as the author changes their master
>>>> record, is troubling because it can lead to race conditions. The changes method, where a name-update
>>>> is an asynchronous process, is more robust, because you can know for sure that *eventually*
>>>> the author's name will be changed everywhere it appears.
>>>>
>>>> It is probably best to make this clear through the UI with a message like:
>>>> "Your name has been changed in the master record. It could take a few minutes for the change
>>>> to appear throughout the site."
>>>>
>>>> In actuality, this is probably no different than in a relational database
>>>> (as in a relational database, you'd probably have a caching layer that takes a few minutes
>>>> to expire anyway.)
>>>>
>>>>> Thats ok.
>>>>> I don't understand if listen feed _changes, feed give me info only
>>>>> about id & rev of changed doc, how i can get that author name is
>>>>> changed?
>>>>>
>>>>
>>>> My method is to have a view of docs by author, and then query that view for
>>>> the old author's name, updating any docs that appear. This way if new writes come in with
>>>> the old name (due to there being out of date replicas of the master record lingering, for
>>>> instance) they will be eventually updated as well. You could have a time-to-live of something
>>>> like 5 minutes (or longer if your system is giant) for the process which is running the query
>>>> for docs-that-say-Joe-but-should-say-Joseph and updating them.
>>>>
>>>> _changes is just a convenient way to trigger that view query (so that you
>>>> aren't polling the view when nothing has happened in the database.) With filtered changes,
>>>> you can even be sure that you are only polling the view when there will be something relevant
>>>> to see. However, all this _changes stuff is really just an optimization over brute force polling
>>>> the view once every N seconds, so you can add it later, when your app is big enough that load
>>>> starts to matter.
>>>>
>>>> Chris
>>>>
>>>>>
>>>>> 2010/4/9 Nicholas Orr <nicholas.orr@zxgen.net>:
>>>>>> i don't think you are getting what the above people are suggesting...
>>>>>>
>>>>>> Go read up on the _changes API :)
>>>>>>
>>>>>> The basics are, every single change in the database is pushed into this
>>>>>> feed. All race conditions that are caused by your ruby way (via the filter)
>>>>>> are averted :)
>>>>>>
>>>>>> Nick
>>>>>>
>>>>>> On Fri, Apr 9, 2010 at 4:34 AM, faust 1111 <faust451@gmail.com> wrote:
>>>>>>
>>>>>>> i means
>>>>>>> when i do
>>>>>>> Content.by_author(self).each {|content|
>>>>>>>          content.author_name = self.name;
>>>>>>>          content.save(bulk=true)
>>>>>>>       }
>>>>>>>
>>>>>>> i don't sure that all contents will updated may be only 5 and then
>>>>>>> process crashed.
>>>>>>>
>>>>>>> 2010/4/8 Andrew Melo <andrew.melo@gmail.com>:
>>>>>>>> On Thu, Apr 8, 2010 at 12:53 PM, faust 1111 <faust451@gmail.com> wrote:
>>>>>>>>> What difference?
>>>>>>>>> if do
>>>>>>>>> Author
>>>>>>>>>  after_save
>>>>>>>>>     if name_changed?
>>>>>>>>>        Content.by_author(self).each {|content|
>>>>>>>>>           content.author_name = self.name;
>>>>>>>>>           content.save(bulk=true)
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>> or i start backend process to track Author _changes.
>>>>>>>>>
>>>>>>>>> This code not guarantee that all contents will updated.
>>>>>>>>
>>>>>>>> I don't get your question. You asked how to make sure that you could
>>>>>>>> change a number of documents consistently, we suggested that you watch
>>>>>>>> _changes to catch any silly race conditions. Then, you told us you
>>>>>>>> didn't need to use _changes, but you were worried that things would be
>>>>>>>> inconsistent.
>>>>>>>>
>>>>>>>> Even with your code above, you get a race condition (if I understand
>>>>>>>> your ruby right, I don't know ruby much at all). Something could
>>>>>>>> happen between when you check to see if a document needs to be changed
>>>>>>>> and the actual change occurs. Then, you're gonna get a conflict and
>>>>>>>> have to write up the logic to handle that intelligently.
>>>>>>>>
>>>>>>>> best,
>>>>>>>> Andrew
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2010/4/8 Andrew Melo <andrew.melo@gmail.com>:
>>>>>>>>>> On Thu, Apr 8, 2010 at 12:29 PM, faust 1111 <faust451@gmail.com> wrote:
>>>>>>>>>>> I can catch changes in my app before save author, may be backend
>>>>>>>>>>> process is surplus in my case.
>>>>>>>>>>> i need consistent, when i update author name i must know that all
>>>>>>>>>>> contents with author was updated success.
>>>>>>>>>>
>>>>>>>>>> Then their suggestion of watching _changes works for you. Start
>>>>>>>>>> watching _changes. Make all your changes to the documents' authors.
>>>>>>>>>> Any changes that come through on _changes, make sure they have the
>>>>>>>>>> proper author. Keep watching _changes until you're sure that nobody
>>>>>>>>>> has stale data they're waiting to submit.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2010/4/8 Zachary Zolton <zachary.zolton@gmail.com>:
>>>>>>>>>>>> I suggest you check out the _changes API:
>>>>>>>>>>>> http://books.couchdb.org/relax/reference/change-notifications
>>>>>>>>>>>>
>>>>>>>>>>>> Basically, if you have doc types A & B, where B maintains a denormed
>>>>>>>>>>>> bit of A, then you can watch the _changes feed in a backend process.
>>>>>>>>>>>> When an A gets updated, hit a view of all B's related to that
>>>>>>>>>>>> particular A, and update the denormed data.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Apr 8, 2010 at 10:20 AM, faust 1111 <faust451@gmail.com> wrote:
>>>>>>>>>>>>> Hi guy's
>>>>>>>>>>>>> I return back to my problem with denormalization.
>>>>>>>>>>>>>
>>>>>>>>>>>>> is it possible to keep consistent when apply denormalization?
>>>>>>>>>>>>> For example
>>>>>>>>>>>>> Content
>>>>>>>>>>>>>   have author (we store author name and id in Content)
>>>>>>>>>>>>>
>>>>>>>>>>>>> When author name changed(that's happens not frequently)
>>>>>>>>>>>>> i need find all content belong to this author and update author name
>>>>>>>>>>>>> but what if this operation not finished (not all docs was updated)
>>>>>>>>>>>>>
>>>>>>>>>>>>> What i can do in this case?
>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> --
>>>>>>>>>> Andrew Melo
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> --
>>>>>>>> Andrew Melo
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
>



-- 
--
Andrew Melo
