On Mar 5, 2009, at 7:24 AM, Jan Lehnardt wrote:
>
> On 5 Mar 2009, at 07:31, Paul Davis wrote:
>
>> On Wed, Mar 4, 2009 at 8:34 PM, Adam Kocoloski <adam.kocoloski@gmail.com> wrote:
>>> Hi folks, we've been running into a problem where multiple
>>> replications with the same source and target are running
>>> simultaneously. This introduces quite a lot of unnecessary network
>>> traffic and causes real problems with update collisions on the local
>>> replication history documents. If replicator A updates the source doc
>>> and replicator B updates the target doc, subsequent replications will
>>> decide that a full replication is necessary.
>>>
>>> I have some ideas about how to ensure only one is running at a time
>>> (more on that in a separate mail), but I'd like some feedback on how
>>> to handle the second..Nth request. Let's call the initial POST to
>>> _replicate "A" and the second POST "B":
>>>
>>> Option 1 -- Respond to B with the results from A
>>> This option works fine if the source is remote. However, if the
>>> source is local, the replication started by A will be missing updates
>>> to the source DB that occurred between A and B. B may be surprised by
>>> that result.
>>>
>>> Option 2 -- Grab an updated DB and continue the replication
>>> This option will include updates to the source that occurred between
>>> A and B in the response to both requests.
>>>
>>> Option 3 -- Respond to A, then trigger another replication for B
>>> In this case we wait till the replication started by A has completed,
>>> then do an incremental one and respond to B with the results of that
>>> incremental.
>>>
>>> I think I'd vote for 3. Cheers, Adam
>>>
>>>
>>
>> If I follow this correctly, the issue is, "POST to _replicate, a
>> second POST to _replicate occurs before the first request finishes"
>> (with the same source/target info).
>>
>> My knowledge of replication is only cursory, but I could also see:
>>
>> Option 4:
>>
>> Same as views, we wait for replication to finish and return the same
>> result to all clients that made a request.
>
> I understand this and Adam's option 3 to be the same. What am I
> missing? :)
No, not quite. In Option 3 the two requesters get different responses.
A gets the result of the original replication. B gets the result of a
second replication, triggered automatically after the first completes,
that picks up any updates to the DB made during the first pass. If no
updates occurred, B receives the result of the first replication.

Paul's Option 4 is more like Options 1 and 2, where A and B get
identical responses. The difference between 1 and 2 is just whether
new updates get included in that response.
Whew.
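
To make the difference between the options concrete, here is a toy
sketch in Python. It is not CouchDB code: the function name respond,
the start_seq/end_seq fields, the seq_at_a/seq_at_b parameters and the
"already_running" error are all made up for illustration, and a
replication is reduced to "copy every source update between two
sequence numbers".

```python
def respond(option, seq_at_a, seq_at_b, checkpoint=0):
    """Return (response_to_A, response_to_B) for a hypothetical policy.

    seq_at_a / seq_at_b are the source update sequences at the moment
    each POST to _replicate arrived (B arrives while A is running).
    checkpoint is the last sequence already replicated.
    """
    def result(start, end):
        # Assumed response shape, not the real _replicate response.
        return {"start_seq": start, "end_seq": end}

    first = result(checkpoint, seq_at_a)

    if option == 1:
        # B just gets A's result; updates made after A are missing.
        return first, first
    if option == 2:
        # The running replication picks up the newer DB state,
        # and both requesters see the extended result.
        extended = result(checkpoint, seq_at_b)
        return extended, extended
    if option == 3:
        # A is answered first, then an incremental pass runs for B.
        if seq_at_b == seq_at_a:
            return first, first  # nothing new, B gets A's result
        return first, result(seq_at_a, seq_at_b)
    if option == 5:
        # B is told a replication is already in progress.
        return first, {"error": "already_running"}
    raise ValueError(f"unknown option: {option}")


if __name__ == "__main__":
    for option in (1, 2, 3, 5):
        print(option, respond(option, seq_at_a=10, seq_at_b=12))
```

Only Option 3 hands A and B different successful results in this
model, and only when the source changed between the two requests.
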
>> Option 5:
>>
>> Return an error on B that says, "Yeah, yeah. Already on it."
>
> This would make replication behave a bit like compaction.
Sort of, in that additional triggers are no-ops. Option 1 also has
that behavior.
> I think I like 3/4 best.
>
> Cheers
> Jan
> --
Best, Adam