couchdb-user mailing list archives

From James Marca
Subject Re: Replication question
Date Mon, 01 Mar 2010 07:01:51 GMT
On Mon, Mar 01, 2010 at 10:29:03AM +1300, Blair Nilsson wrote:
> It shouldn't be surprising though, the target database may already
> have records in it that would change the results, which would be
> difficult to detect without running the map on all the data that was
> already there. Also it is quite likely that it would take longer to
> replicate all the view data than regenerate it. Hell, you may never
> use that view on the replicated end so transferring the processed data
> is a waste anyway.

Okay, but I still think it is a bug.  Aside from specific document
conflicts, the rules for views are that identical input equals
identical output.  So the documents that replicate successfully from
one db to the other should produce identical output
from identical view code.  I don't know much about b-trees, but I
suspect there are algorithms to merge two b-trees efficiently.
If that is true, and the view is already computed, isn't the laziest
response just to copy it over and merge it with the current view,
even if you have to somehow caveat the replication?
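
To make the argument concrete, here is a toy Python sketch. It is not
CouchDB's storage format or API: a view is modelled as a sorted list of
(key, doc_id, value) rows rather than a b-tree, and map_doc, build_view
and merge_views are made-up names. It assumes the same deterministic map
function ran on both sides and that no document exists on both sides in
different revisions.

```python
import heapq

def map_doc(doc):
    """Hypothetical deterministic map function: emit (key, value) pairs."""
    return [(doc["type"], 1)]

def build_view(docs):
    """Run the map over every doc; return rows sorted by (key, doc_id)."""
    rows = [(key, doc["_id"], value)
            for doc in docs
            for key, value in map_doc(doc)]
    return sorted(rows)

def merge_views(local_rows, copied_rows):
    """Merge two already-sorted row lists, dropping exact duplicate rows."""
    merged = []
    for row in heapq.merge(local_rows, copied_rows):
        if not merged or merged[-1] != row:
            merged.append(row)
    return merged

# Assumed sample data: two docs on the source, one already on the target.
source_docs = [{"_id": "a", "type": "car"}, {"_id": "b", "type": "truck"}]
target_docs = [{"_id": "c", "type": "car"}]

# Merging the two precomputed views...
merged = merge_views(build_view(target_docs), build_view(source_docs))
# ...gives the same rows as re-running the map over everything.
rebuilt = build_view(target_docs + source_docs)
```

Under those assumptions merged and rebuilt hold the same rows, which is
the "identical input equals identical output" rule at work.  The sketch
says nothing about the conflict case, which is where the caveat would
have to go.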

CouchDB seems intelligent enough in the view generation to notice when
docs have changed and only compute views on those docs, so why can't
similar code get thrown at this?
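
A rough sketch of the kind of bookkeeping I mean, again in toy Python
and not CouchDB's actual updater: it remembers the revision it last
indexed for each document and re-runs the map only when that revision
has changed.  The names update_view, index and seen_revs are
hypothetical, and comparing _rev strings is an assumption made for
illustration.

```python
def map_doc(doc):
    """Hypothetical deterministic map function: emit (key, value) pairs."""
    return [(doc["type"], 1)]

def update_view(index, seen_revs, docs):
    """Re-run the map only for docs whose _rev differs from the indexed one.

    index maps doc id -> emitted rows; seen_revs maps doc id -> last _rev.
    Returns the ids of the docs that were actually recomputed.
    """
    recomputed = []
    for doc in docs:
        doc_id, rev = doc["_id"], doc["_rev"]
        if seen_revs.get(doc_id) == rev:
            continue  # unchanged since the last run, keep the old rows
        index[doc_id] = map_doc(doc)
        seen_revs[doc_id] = rev
        recomputed.append(doc_id)
    return recomputed

index, seen_revs = {}, {}
docs = [
    {"_id": "a", "_rev": "1-x", "type": "car"},
    {"_id": "b", "_rev": "1-y", "type": "truck"},
]
first = update_view(index, seen_revs, docs)   # both docs are new

docs[1] = {"_id": "b", "_rev": "2-z", "type": "van"}
second = update_view(index, seen_revs, docs)  # only "b" has changed
```

The second pass touches only the changed document.  The same check
could, in principle, decide which copied view rows are safe to keep on
the target and which need recomputing.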

As to whether copying the views is useful, I think that is
application-specific.  I've got a couple terabytes of data waiting in
the pipe to get processed this way, so actually, in my use case,
re-running the view is out of the question, and re-using views is the
height of efficiency.  And finally, I've only got two views (two
design documents) and I'm certainly going to be using them!


James Marca
