incubator-couchdb-user mailing list archives

From Damien Katz <dam...@apache.org>
Subject Re: dirty reads - update strategies
Date Thu, 13 Nov 2008 18:28:59 GMT

On Nov 13, 2008, at 1:10 PM, ara.t.howard wrote:

>
> On Nov 13, 2008, at 10:39 AM, Damien Katz wrote:
>
>> My answer is "Don't do that". Values in documents shouldn't depend  
>> on values in other documents, that's a better fit for a relational  
>> or OO DB. In your example though, CouchDB's views could be used to  
>> compute the sums.
>
> i don't think that's realistic.  consider something like the  
> following:
>
> let's say we write a publishing system, users can create documents  
> with content and tags.  at the end of the month the editor is going  
> to write a summary of the content from that month, obviously this  
> summary should be tagged with the union of the tags from all  
> summarized content - for later searching.  regardless of whether we  
> store the tags inside the document or outside of it we have quite a  
> task - we need to get a consistent read of all content for the  
> month, with all its tags, in order to properly construct the  
> summary document with its aggregate tags. this isn't strict  
> dependence - it's merely a read/write consistency issue which nearly  
> any application is going to face.  we can argue that it's not  
> important that the summary of tags exactly mirrors the tags of its  
> constituent parts, but that kind of thinking results not in an  
> information store, but a collection of valueless data.

CouchDB views are a consistent snapshot of the database, so generate
your reports from the views. The view APIs are the place to look for
better reporting capabilities.
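
As a rough sketch of the idea, the Python below mimics in memory what
a map/reduce view over tagged content would compute: the map step
emits one (month, tag) row per tag, and the reduce step takes the
union per month. It is not CouchDB's API (real view functions are
written in JavaScript), and the field names `type`, `month`, and
`tags` are assumptions made for the illustration.

```python
from collections import defaultdict

# Hypothetical documents; the field names are assumptions for this sketch.
docs = [
    {"_id": "a", "type": "content", "month": "2008-11", "tags": ["couchdb", "views"]},
    {"_id": "b", "type": "content", "month": "2008-11", "tags": ["views", "replication"]},
    {"_id": "c", "type": "content", "month": "2008-10", "tags": ["erlang"]},
    {"_id": "s", "type": "summary", "month": "2008-10", "tags": []},
]


def map_tags(doc):
    """Map step: emit one (month, tag) row per tag of a content document."""
    if doc.get("type") == "content":
        for tag in doc.get("tags", []):
            yield doc["month"], tag


def tags_by_month(documents):
    """Reduce step: group the emitted rows by month and union the tags."""
    grouped = defaultdict(set)
    for doc in documents:
        for month, tag in map_tags(doc):
            grouped[month].add(tag)
    return {month: sorted(tags) for month, tags in grouped.items()}


monthly_tags = tags_by_month(docs)
print(monthly_tags["2008-11"])
```

The monthly tag set is derived from the content documents at read
time, so nothing has to be copied into the summary document and kept
in sync.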

>
>
> anyhow, i think it's important to be able to agree upon best  
> practices for this kind of operation.  saying that values shouldn't  
> depend on values in other documents is quite a statement - it means  
> couch should not be used for any information store where the  
> information value needs to grow recursively.

What I mean is that you should never depend on the accuracy of
computed values stored in documents when those values are based on
other documents, particularly with replication.

> in my case we're modeling financial information which gets processed  
> in increasingly sophisticated ways - where documents are inputs to  
> processes which produce other documents.  i can't think of an  
> application that does not do the same thing: a blog comment depends  
> on the blog post, a 'friends list' depends on the users, etc.

>
>
> are you referring to 'values' as different from 'ids' ?

Yes, I mean values as computed values.  The main post shouldn't be
updated with a comment count or anything computed like that. It's fine
if comments have a reference to their parent, and it's fine if the
comments are tagged as children of the post. This way, when the main
post is opened, the comment count can be computed from a view, or when
viewing a comment, the user is also shown the parent, and maybe
subcomments if it's a threaded discussion.
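
A minimal sketch of that layout, again as plain Python imitating the
shape of a map/reduce view rather than CouchDB's actual API. The
`type` and `parent_id` fields are assumptions for the example: each
comment carries a reference to its parent, and the count is computed
from the comments instead of being stored on the post.

```python
from collections import Counter

# Hypothetical documents; `type` and `parent_id` are assumed field names.
docs = [
    {"_id": "post-1", "type": "post", "title": "Dirty reads"},
    {"_id": "c1", "type": "comment", "parent_id": "post-1"},
    {"_id": "c2", "type": "comment", "parent_id": "post-1"},
    {"_id": "c3", "type": "comment", "parent_id": "c1"},  # threaded reply
]


def map_comment(doc):
    """Map step: emit (parent id, 1) for every comment document."""
    if doc.get("type") == "comment":
        yield doc["parent_id"], 1


def comment_counts(documents):
    """Reduce step: sum the emitted 1s per parent id."""
    counts = Counter()
    for doc in documents:
        for parent_id, n in map_comment(doc):
            counts[parent_id] += n
    return counts


counts = comment_counts(docs)
print(counts["post-1"])  # number of direct comments on the post
```

The post document is never rewritten when a comment arrives, so there
is no stored count that can drift out of date.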

-Damien

>
>
> kind regards.
>
> a @ http://codeforpeople.com/
> --
> we can deny everything, except that we have the possibility of being  
> better. simply reflect on that.
> h.h. the 14th dalai lama
>
>
>

