couchdb-dev mailing list archives

From Jan Lehnardt
Subject Re: handling simultaneous identical replications
Date Thu, 05 Mar 2009 12:24:08 GMT

On 5 Mar 2009, at 07:31, Paul Davis wrote:

> On Wed, Mar 4, 2009 at 8:34 PM, Adam Kocoloski wrote:
>> Hi folks, we've been running into a problem where multiple replications
>> with the same source and target are running simultaneously.  This
>> introduces quite a lot of unnecessary network traffic and causes real
>> problems with update collisions on the local replication history
>> documents.  If replicator A updates the source doc and replicator B
>> updates the target doc, subsequent replications will decide that a full
>> replication is necessary.
>>
>> I have some ideas about how to ensure only one is running at a time
>> (more on that in a separate mail), but I'd like some feedback on how to
>> handle the second..Nth request.  Let's call the initial POST to
>> _replicate "A" and the second POST "B":
>>
>> Option 1 -- Respond to B with the results from A
>> This option works fine if the source is remote.  However, if the source
>> is local, the replication started by A will be missing updates to the
>> source DB that occurred between A and B.  B may be surprised by that
>> result.
>>
>> Option 2 -- Grab an updated DB and continue the replication
>> This option will include updates to the source that occurred between A
>> and B in the response to both requests.
>>
>> Option 3 -- Respond to A, then trigger another replication for B
>> In this case we wait till the replication started by A has completed,
>> then do an incremental one and respond to B with the results of that
>> incremental.
>>
>> I think I'd vote for 3.  Cheers, Adam
> If I follow this correctly, the issue is, "POST to _replicate, a
> second POST to _replicate occurs before the first request finishes"
> (with the same source/target info).
> My knowledge of replication is only cursory, but I could also see:
> Option 4:
> Same as views, we wait for replication to finish and return the same
> result to all clients that made a request.

I understand this and Adam's option 3 to be the same. What am I missing? :)

> Option 5:
> Return an error on B that says, "Yeah, yeah. Already on it."

This would make replication behave a bit like compaction.

I think I like 3/4 best.
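
For illustration only, here is a rough Python sketch of how options 3 and 5
could behave. It is not how CouchDB's replicator is written (that is Erlang),
and every name in it (ReplicationManager, replicate_queued,
replicate_or_reject, ReplicationBusy, replicate_fn) is hypothetical. It
assumes replicate_fn(source, target) runs one replication, incremental when
a checkpoint exists, and returns its result.

```python
import threading


class ReplicationBusy(Exception):
    """Raised when an identical replication is already running (option 5)."""


class ReplicationManager:
    """Allows at most one running replication per (source, target) pair."""

    def __init__(self, replicate_fn):
        # replicate_fn is an assumed callable: (source, target) -> result
        self._replicate = replicate_fn
        self._guard = threading.Lock()  # protects the lock table below
        self._locks = {}                # (source, target) -> threading.Lock

    def _lock_for(self, source, target):
        with self._guard:
            return self._locks.setdefault((source, target), threading.Lock())

    def replicate_queued(self, source, target):
        # Option 3: request B blocks until A has finished, then runs its own
        # replication and gets that run's result.
        with self._lock_for(source, target):
            return self._replicate(source, target)

    def replicate_or_reject(self, source, target):
        # Option 5: request B fails immediately while A is still running.
        lock = self._lock_for(source, target)
        if not lock.acquire(blocking=False):
            raise ReplicationBusy("Already on it.")
        try:
            return self._replicate(source, target)
        finally:
            lock.release()
```

Requests for a different source/target pair take a different lock, so they
are not held up by an unrelated replication.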

