couchdb-dev mailing list archives

From Damien Katz <dam...@apache.org>
Subject Attachment level replication
Date Tue, 14 Jul 2009 22:06:35 GMT
I've just finished the deterministic revs work, and I've made changes
to how the attachment metadata is stored. I figure, since I'm in there
changing things, I should go ahead and make a change to allow for
incremental attachment replication, so we only replicate the binary
attachments (which can be very large) that have changed, instead of
all the attachments any time something in the document changes.

Currently, when we replicate a changed document, we replicate all the
attachments, regardless of whether they've changed.

It looks something like this:

Replicator gets the revs of the latest changes since seq N from the
source.
Replicator asks the target, "do you have these revs?"
Target responds, "here are the revs I am missing."
Replicator asks source, "give me the revs the target is missing, or
the latest revisions of those revs."
Source returns the docs and attachment info.
Replicator writes documents and attachments to target.
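
To make the cost of the current flow concrete, here is a minimal
in-memory sketch of the steps above. ToyDb, Doc, and replicate_full are
hypothetical names for illustration only, not CouchDB APIs, and the
"or latest revisions" fallback is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    rev: str
    body: dict
    attachments: dict = field(default_factory=dict)  # name -> bytes


class ToyDb:
    """Hypothetical in-memory stand-in for a database (not a CouchDB API)."""

    def __init__(self):
        self.docs = {}      # (doc_id, rev) -> Doc
        self.changes = []   # list of (seq, doc_id, rev); seq starts at 1

    def write(self, doc):
        self.docs[(doc.doc_id, doc.rev)] = doc
        self.changes.append((len(self.changes) + 1, doc.doc_id, doc.rev))

    def changes_since(self, seq):
        return [(d, r) for s, d, r in self.changes if s > seq]

    def missing_revs(self, revs):
        return [key for key in revs if key not in self.docs]


def replicate_full(source, target, since_seq=0):
    """Copy every missing rev together with ALL of its attachments.

    Returns the number of attachment bytes transferred.
    """
    revs = source.changes_since(since_seq)   # revs changed since seq N
    missing = target.missing_revs(revs)      # "here are the revs I am missing"
    sent = 0
    for key in missing:
        doc = source.docs[key]               # source returns doc + attachments
        target.write(doc)                    # write everything to the target
        sent += sum(len(data) for data in doc.attachments.values())
    return sent
```

In this sketch, editing only the body of a document that carries a
1000-byte attachment still transfers all 1000 bytes again on the next
replication.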


For incremental replication, CouchDB would now track which revision an
attachment was last edited in, storing the revision number along with
the attachment metadata. When an attachment is updated, the revision
number is updated along with it. And since we hash the attachment's
contents, we can be smart and update its rev number only when it
actually changes.
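
A rough sketch of that bookkeeping, under some assumptions: the
metadata layout, the field names "digest" and "rev_num", and the use of
SHA-256 are all made up for illustration and are not taken from the
actual storage format.

```python
import hashlib


def update_attachment(meta, name, data, new_rev_num):
    """Return attachment metadata after writing one attachment.

    meta maps attachment name -> {"digest": str, "rev_num": int}.
    The rev number is bumped only when the content hash changes.
    The hash function (SHA-256) is an assumption of this sketch.
    """
    digest = hashlib.sha256(data).hexdigest()
    old = meta.get(name)
    if old is not None and old["digest"] == digest:
        return meta  # identical content keeps its existing rev number
    updated = dict(meta)
    updated[name] = {"digest": digest, "rev_num": new_rev_num}
    return updated
```

Re-saving an attachment with identical bytes leaves its rev number
alone, so a later replication has no reason to resend it.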

New Replication:

Replicator gets the revs of the latest changes since seq N from the
source.
Replicator asks the target, "do you have these revs?"
Target responds, "here are the revs I am missing, and these are the
latest revs I do have."
Replicator asks source, "give me the revs the target is missing, or
the latest revisions of those revs."
Source returns the docs and attachment info.
Replicator figures out which, if any, earlier revisions of the doc
already exist on the target, using the "latest revs" the target gave us.
Replicator writes to target the documents and only the attachments
that have changed since the latest revisions already on target.
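
The final filtering step might look roughly like the sketch below. It
assumes each attachment records the rev number in which it last
changed, and that the target's latest rev is an ancestor of the rev
being sent (no conflicting branches). attachments_to_send is a
hypothetical helper, not an existing API.

```python
def attachments_to_send(attachment_rev_nums, target_latest_rev_num):
    """Pick the attachments that changed after the target's newest rev.

    attachment_rev_nums maps name -> rev number of the last change.
    target_latest_rev_num is None when the target holds no earlier
    revision of the document, in which case everything is sent.
    """
    if target_latest_rev_num is None:
        return sorted(attachment_rev_nums)
    return sorted(
        name
        for name, rev_num in attachment_rev_nums.items()
        if rev_num > target_latest_rev_num
    )
```

For example, if the target already has rev 2 of a doc whose
attachments last changed in revs 1 and 3, only the rev 3 attachment
needs to go over the wire.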

This would require some new APIs in addition to changes in the
replicator, but we can keep the old APIs around so old versions can
still replicate to us.

Feedback please.

-Damien
