couchdb-dev mailing list archives

From Adam Kocoloski
Subject Re: proposed replication rev history changes
Date Mon, 09 Feb 2009 05:31:07 GMT
Ok, thanks for the clarification.  I don't see any major downsides  
beyond the ones you already mentioned. The inability to replicate  
between versions is a bit of a bummer -- I'd want to at least look  
into a bridge that lets old servers replicate to new ones.

Your point about reducing the chance of collision is a good one,  
especially since Couch is using a 32 bit sample space for revision  
IDs.  The probability of zero collisions between any two revisions in  
a given document history is

N!/((N-M)! * N^M)

with N = 2**32 and M = "max rev history".  With M = 128, that  
probability drops to 0.999998.  In a 400k document DB where each doc  
has the max number of revisions it's likely that at least one has a  
duplicate rev.  That's no good.  I think we could eventually see  
transient cases of revisions being skipped by the replicator with the  
trunk code.
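
As a rough check of those figures, here is a minimal Python sketch. It assumes revision IDs are drawn uniformly at random from the 32 bit space and that every document carries the full 128-revision history; the function and variable names are made up for illustration and are not taken from the CouchDB source.

```python
import math

def prob_no_collision(n: int, m: int) -> float:
    """P(m IDs drawn uniformly from n values are all distinct).

    Equals n! / ((n - m)! * n**m), computed as a sum of logs so the
    result stays accurate when it is very close to 1.
    """
    log_p = sum(math.log1p(-k / n) for k in range(m))
    return math.exp(log_p)

N = 2**32        # size of the revision ID space
M = 128          # "max rev history" (value used in the message)
DOCS = 400_000   # documents in the example database

p_clean = prob_no_collision(N, M)   # one document has no duplicate rev
p_any_doc = 1.0 - p_clean ** DOCS   # at least one document has a duplicate

print(f"P(no duplicate in one doc)      = {p_clean:.7f}")
print(f"P(some doc has a duplicate rev) = {p_any_doc:.2f}")
```

Under these assumptions the per-document figure agrees with the 0.999998 quoted above, and the chance that at least one of the 400k documents holds a duplicate comes out a little over one half.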

Adding the revseq doesn't reduce the chances of a duplicate rev, but  
it does mean that replication won't accidentally match revisions from  
different revseqs.  Instead, the concern would be that two different  
servers would generate the same revision ID from different updates at  
the same revseq.  It's a concern only for multi-master setups, and  
even then each document that had been updated on both source and  
target would only have a 1/N chance of being skipped due to an  
accidentally matching revision.  I guess it would happen once every 4  
billion times or so.
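
For scale, the 1/N odds can be worked out directly. This is only a sketch, assuming two servers pick their revision IDs independently and uniformly at random; `skip_odds` is a hypothetical helper name, not part of CouchDB.

```python
def skip_odds(bits: int) -> int:
    """Return N, where 1/N is the chance that two independent,
    uniformly random IDs of the given bit width are equal."""
    return 2 ** bits

for bits in (32, 64):
    n = skip_odds(bits)
    print(f"{bits}-bit revision IDs: 1 in {n:,} (~{n:.2e})")
```

With N = 2**32 the odds are roughly one in 4.3 billion per document updated on both sides; a 64 bit space would multiply N by another factor of 2**32.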

Or Couch could switch to a 64 bit space for the revision IDs ;-)


On Feb 8, 2009, at 2:40 PM, Damien Katz wrote:

> I don't think it's strictly necessary, but it makes merging new  
> edits simpler and it significantly reduces the chances of collisions  
> between revision ids, so there is less ambiguity. What downsides do  
> you see?
> -Damien
> On Feb 8, 2009, at 2:28 PM, Adam Kocoloski wrote:
>> Hi Damien, it seems to me that you're conflating two separate  
>> issues.  I agree that the revision history should be trimmed, and  
>> that this will potentially introduce spurious conflicts when two  
>> servers have no shared history for a document.  I don't see how  
>> this change by itself requires the addition of a revseq to the JSON  
>> revision format.  Is it really required?
>> Adam
