couchdb-dev mailing list archives

From Damien Katz <dam...@apache.org>
Subject Re: proposed replication rev history changes
Date Mon, 09 Feb 2009 05:40:46 GMT

On Feb 9, 2009, at 12:31 AM, Adam Kocoloski wrote:

> Ok, thanks for the clarification.  I don't see any major downsides  
> beyond the ones you already mentioned. The inability to replicate  
> between versions is a bit of a bummer -- I'd want to at least look  
> into a bridge that lets old servers replicate to new ones.
>
> Your point about reducing the chance of collision is a good one,  
> especially since Couch is using a 32 bit sample space for revision  
> IDs.  The probability of zero collisions between any two revisions  
> in a given document history is
>
> N!/((N-M)! * N^M)
>
> with N = 2**32 and M = "max rev history".  With M = 128, that  
> probability drops to 0.999998.  In a 400k document DB where each doc  
> has the max number of revisions it's likely that at least one has a  
> duplicate rev.  That's no good.  I think we could eventually see  
> transient cases of revisions being skipped by the replicator with  
> the trunk code.
>
> Adding the revseq doesn't reduce the chances of a duplicate rev, but  
> it does mean that replication won't accidentally match revisions  
> from different revseqs.  Instead, the concern would be that two  
> different servers would generate the same revision ID from different  
> updates at the same revseq.  It's a concern only for multi-master  
> setups, and even then each document that had been updated on both  
> source and target would only have a 1/N chance of being skipped due  
> to an accidentally matching revision.  I guess it would happen once  
> every 4 billion times or so (1/N with N = 2**32).
>
> Or Couch could switch to a 64 bit space for the revision IDs ;-)
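
The figures quoted above can be checked numerically. The following is a minimal Python sketch, not anything from the CouchDB code base; the function name `p_no_collision` and the variable names are made up for illustration, and it assumes revision IDs are drawn uniformly at random from the 32 bit space. It evaluates N!/((N-M)! * N^M) as a product of (1 - k/N) terms so the huge factorials never have to be formed.

```python
import math

def p_no_collision(n_space: int, m_revs: int) -> float:
    """Probability that m_revs IDs drawn uniformly from n_space values are all distinct.

    Equal to N!/((N-M)! * N^M), written as prod_{k=0}^{M-1} (1 - k/N)
    and summed in log space for numerical stability.
    """
    return math.exp(sum(math.log1p(-k / n_space) for k in range(m_revs)))

N = 2**32          # 32 bit sample space for revision IDs
M = 128            # "max rev history"
DOCS = 400_000     # documents, each assumed to carry a full history

p_doc = p_no_collision(N, M)       # no duplicate rev within one document
p_db = 1.0 - p_doc ** DOCS         # at least one document has a duplicate

print(f"per-document, no collision: {p_doc:.6f}")
print(f"400k docs, at least one duplicate: {p_db:.2f}")
print(f"1/N chance per doubly-updated document: 1 in {N}")
```

Under these assumptions the per-document figure agrees with the 0.999998 quoted above, the chance of at least one duplicate across 400k full-history documents comes out slightly better than even, and 1/N is about 1 in 4.3 billion.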

There is nothing preventing larger revs (or even non-integer revs), since  
a rev is just stored as a string (real efficient, I know). The size could  
easily be a server or database setting.

-Damien
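
As an illustration of the point about size being a setting: because a rev is stored as a string, widening it only changes how many random bits go into that string. The Python sketch below is hypothetical and is not how CouchDB generates revisions; `new_rev_id` and its `bits` parameter are invented names standing in for a server or database setting.

```python
import secrets

def new_rev_id(bits: int = 32) -> str:
    """Return a random revision ID of the given bit width as a decimal string.

    `bits` stands in for a hypothetical server or database setting.
    """
    if bits <= 0:
        raise ValueError("bits must be positive")
    return str(secrets.randbits(bits))

rev32 = new_rev_id(32)   # drawn from a 2**32 space
rev64 = new_rev_id(64)   # drawn from a 2**64 space, same storage format
print(rev32, rev64)
```

Code that only compares revs as strings would not need to care which width produced them.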


>
>
> Adam
>
> On Feb 8, 2009, at 2:40 PM, Damien Katz wrote:
>
>> I don't think it's strictly necessary, but it makes merging new  
>> edits simpler and it significantly reduces the chances of  
>> collisions between revision ids, so there is less ambiguity. What  
>> downsides do you see?
>>
>> -Damien
>>
>> On Feb 8, 2009, at 2:28 PM, Adam Kocoloski wrote:
>>
>>> Hi Damien, it seems to me that you're conflating two separate  
>>> issues.  I agree that the revision history should be trimmed, and  
>>> that this will potentially introduce spurious conflicts when two  
>>> servers have no shared history for a document.  I don't see how  
>>> this change by itself requires the addition of a revseq to the  
>>> JSON revision format.  Is it really required?
>>>
>>> Adam
>>>
>>
>

