couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Conflict resolution protocol
Date Wed, 23 Nov 2011 00:37:09 GMT
On Tue, Nov 22, 2011 at 5:37 PM, Alex Besogonov
<alex.besogonov@gmail.com> wrote:
> I'm trying to understand the conflict resolution protocol of CouchDB
> (the selection of the winning revision). So far I understand that
> CouchDB does essentially this:
>
> 1) Finds the revision with the highest number and if there are no
> other revisions with the same number then it is declared the winner.
> 2) If there are several revisions with the same revision number, then
> the one with the lowest revision ID is selected (Erlang's string
> comparison function is used to find the lowest string).
>

I'd avoid using the term "revision number" in this case because it
implies some sort of serial incrementing of a value. I'd also avoid
calling it "conflict resolution" as it never attempts to resolve
anything; it only identifies when a conflict exists.

The basic algorithm can be described as: "When multiple leaves in the
revision tree exist in an undeleted state, there is a conflict. To
choose which conflict 'wins' we first look for the revision with the
greatest number of edits (ie, deepest path from root). If multiple revisions
have an equal depth we break the tie by arbitrary sorting criteria on
the revision."
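
A minimal Python sketch of that description, not the actual Erlang
implementation. The Leaf type and function names are made up for
illustration. Two things here are assumptions rather than part of the
description above: the tie-break is taken to be "highest revision hash
in plain string order", and deleted leaves are only considered when no
undeleted leaf exists.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Leaf:
    """One leaf of a document's revision tree (hypothetical type)."""
    depth: int             # number of edits on the path from the root
    rev_hash: str          # opaque revision identifier
    deleted: bool = False  # True if this leaf is a deletion


def live_leaves(leaves: Iterable[Leaf]) -> List[Leaf]:
    """Return only the leaves that are in an undeleted state."""
    return [leaf for leaf in leaves if not leaf.deleted]


def has_conflict(leaves: Iterable[Leaf]) -> bool:
    """A conflict exists when more than one undeleted leaf exists."""
    return len(live_leaves(leaves)) > 1


def pick_winner(leaves: Iterable[Leaf]) -> Optional[Leaf]:
    """Pick the winning leaf, or None if there are no leaves at all.

    Sort key, compared left to right:
      1. undeleted before deleted (assumption, see lead-in)
      2. deepest path from the root
      3. highest rev_hash in string order (assumed tie-break)
    """
    leaves = list(leaves)
    if not leaves:
        return None
    return max(leaves, key=lambda l: (not l.deleted, l.depth, l.rev_hash))


if __name__ == "__main__":
    tree = [
        Leaf(2, "aaa"),
        Leaf(3, "bbb"),
        Leaf(3, "ccc"),
        Leaf(5, "zzz", deleted=True),
    ]
    print(has_conflict(tree))
    print(pick_winner(tree))
```

Nothing is resolved here: the losing leaves stay in the tree, and the
function only decides which one gets reported as the winner.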

It's actually a fairly simple algorithm with a weird implementation
and, as you have found out, little to no documentation outside a few
snippets here and there.

> After the winner is found everything else is straightforward -
> revision trees are aligned, conflicting revisions are stored, extra
> revisions are stemmed, etc.
>

I remember thinking that before tearing my hair out over COUCHDB-1265.
:D But yeah, once a general description of the algorithm exists, it's
not impossible to read through the implementation and finally see it
snap into focus.

> I'm going to document all of my findings for the future developers who
> might be interested to use CouchDB with other systems.
>

That would be awesome. I've been long meaning to rewrite the
replication algorithm as documented Python code so that it would be
more tenable for non-Erlangers to read. At its core, it's a rather
simple thing but requires that people learn an unfamiliar language to
navigate some of the finer details.

Thanks for the effort
