incubator-couchdb-user mailing list archives

From James Marca <jma...@translab.its.uci.edu>
Subject Re: Replication question
Date Mon, 01 Mar 2010 18:21:46 GMT
On Mon, Mar 01, 2010 at 08:33:53AM -0800, J Chris Anderson wrote:
> 
> On Feb 28, 2010, at 11:01 PM, James Marca wrote:
> 
> > On Mon, Mar 01, 2010 at 10:29:03AM +1300, Blair Nilsson wrote:
> >> It shouldn't be surprising though, the target database may already
> >> have records in it that would change the results, which would be
> >> difficult to detect without running the map on all the data that was
> >> already there. Also it is quite likely that it would take longer to
> >> replicate all the view data than to regenerate it. Hell, you may never
> >> use that view on the replicated end so transferring the processed data
> >> is a waste anyway.
> >> 
> > 
> > Okay, but I still think it is a bug.  Aside from specific document
> > conflicts, the rules for views are that identical input equals
> > identical output.  So the documents that replicate successfully from
> > one db to the other should produce identical output
> > from identical view code.  I don't know much about b-trees, but I
> > suspect there are algorithms to merge two b-trees efficiently.
> > If that is true and the view is already computed, isn't the laziest
> > response just to copy it over and merge it with the current view,
> > even if you have to somehow caveat the replication conflicts?
> 
> it wouldn't be wrong to do this, but we certainly don't do it yet... complexity. time.
> we'll get there.


Yes, I apologize: "bug" is the wrong word; "feature request" is what I
meant to say. I wish I could tackle this myself, but my time is no
longer my own these days.

> 
> 
> > 
> > CouchDB seems intelligent enough in the view generation to notice when
> > docs have changed and only compute views on those docs, so why can't
> > similar code get thrown at this?
> > 
> > As to whether copying the views is useful, I think it is
> > application-specific.  I've got a couple terabytes of data waiting in
> > the pipe to get processed this way, so actually, in my use case,
> > re-running the view is out of the question, and re-using views is the
> > height of efficiency.  And finally, I've only got two views (two
> > design documents) and I'm certainly going to be using them!
> > 
> 
> One thing you can do, is merge the view queries without merging the databases. As long
> as you have identical view definitions and you can bridge the nodes with something like CouchDB
> Lounge smartproxy, you should be good.
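
To make "merging the view queries" concrete, here is a minimal sketch of
the client-side idea. It is not how Lounge smartproxy is implemented; it
only assumes that each node returns map-view rows already sorted by key,
and that the keys are plain strings or numbers of one type (CouchDB's real
key collation is richer than Python's ordering). The function name
merge_view_rows and the sample rows are made up for illustration.

```python
import heapq
from typing import Any, Dict, Iterable, Iterator

Row = Dict[str, Any]


def merge_view_rows(*row_lists: Iterable[Row]) -> Iterator[Row]:
    """Merge per-node view rows into one stream ordered by (key, id).

    Assumption: every input is already sorted by (key, id), and keys are
    mutually comparable in Python (all strings or all numbers).
    """
    return heapq.merge(*row_lists, key=lambda row: (row["key"], row["id"]))


if __name__ == "__main__":
    # Hypothetical rows, shaped like the "rows" array of a view response.
    node_a = [
        {"id": "a1", "key": "2010-03-01", "value": 1},
        {"id": "a2", "key": "2010-03-03", "value": 1},
    ]
    node_b = [{"id": "b1", "key": "2010-03-02", "value": 1}]
    for row in merge_view_rows(node_a, node_b):
        print(row["key"], row["id"])
```

A reduce view would additionally need the per-node values combined, which
this sketch does not attempt.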

I just might try that.  Lounge looks like it's getting lots of
developer attention.  All I really want in the short term is to hide
merging the view queries from the client.  In the longer term though
I'd love to physically stick a CouchDB server on data collection boxes
in the field, so that collecting data becomes a simple pull
replication.
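
For reference, a pull replication is a POST to the _replicate endpoint of
the server that should receive the data, with the remote database as
"source" and a local database as "target". The sketch below only builds
that request; the host names and the database name "sensor_data" are
placeholders, not anything from a real deployment.

```python
import json
import urllib.request


def build_pull_replication(local_server: str, remote_db_url: str,
                           local_db: str) -> urllib.request.Request:
    """Build (but do not send) a POST to <local_server>/_replicate.

    The local server pulls from remote_db_url into its own local_db.
    """
    body = json.dumps({"source": remote_db_url, "target": local_db})
    return urllib.request.Request(
        url=local_server.rstrip("/") + "/_replicate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder addresses: a central server pulling from one field box.
    req = build_pull_replication(
        "http://localhost:5984",
        "http://fieldbox.example.org:5984/sensor_data",
        "sensor_data",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually run it: urllib.request.urlopen(req)
```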

Regards, 

James Marca


