couchdb-replication mailing list archives

From Chris Anderson <jch...@couchbase.com>
Subject Re: rev hash stability
Date Sat, 18 Oct 2014 02:46:48 GMT
On Fri, Oct 17, 2014 at 4:17 PM, Jens Alfke <jens@couchbase.com> wrote:

>
> > On Oct 17, 2014, at 2:22 PM, Brian Mitchell <brian@standardanalytics.io>
> wrote:
> >
> > Giving revs meaning outside of this scope is likely to bring up more meta
> > discussion about the CouchDB data model and a long history of
> > undocumented choices which only manifest in the particular
> > implementation we have today.
>
> That does appear to be a danger. I'm not interested in bike-shedding; if
> the Apache CouchDB community can't make progress on this issue then we can
> discuss it elsewhere to come up with solutions. I can't speak for Chris,
> but I'm here as a courtesy and because I believe interoperability is
> important. But I believe making progress is more important.
>
>
My original motivation for raising the issue is that I expect to be writing an
integration suite to make sure the Couchbase rev generators on various
platforms all give the same answer. I'm hoping someone on this list has
thought more about the problem than I have, so that when we do move our stuff
to a uniform approach, it has at least a chance of being appealing to other
implementations.
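
As a rough illustration of what such an integration suite might check, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two generator functions are hypothetical stand-ins for per-platform implementations (they are not the Couchbase or CouchDB algorithms), and `find_disagreements` is an invented helper that runs every generator over shared test documents and reports the ones where the answers differ.

```python
import hashlib
import json
from typing import Callable, Dict, List

# A rev generator maps a document body to a digest string (hypothetical type).
RevGenerator = Callable[[dict], str]


def digest_sorted_keys(doc: dict) -> str:
    """Hypothetical platform A: hash the JSON with keys sorted."""
    data = json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def digest_insertion_order(doc: dict) -> str:
    """Hypothetical platform B: hash the JSON in key insertion order."""
    data = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def find_disagreements(
    generators: Dict[str, RevGenerator], docs: List[dict]
) -> List[dict]:
    """Return one entry per document on which the generators do not all agree."""
    mismatches = []
    for doc in docs:
        results = {name: gen(doc) for name, gen in generators.items()}
        if len(set(results.values())) > 1:
            mismatches.append({"doc": doc, "results": results})
    return mismatches


# Two logically equal documents whose keys were written in a different order.
docs = [{"a": 1, "b": 2}, {"b": 2, "a": 1}]
generators = {"sorted": digest_sorted_keys, "insertion": digest_insertion_order}
mismatches = find_disagreements(generators, docs)
```

With these stand-ins, only the second document is reported, because key ordering is the one place the two generators diverge. A real suite would replace the stand-ins with calls into each platform's actual generator and use a much larger set of shared test vectors.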

But I do expect that there will always be cases where rev bodies are
random, and that replicators will always be able to handle that just fine.
(The exception is the case Jens raises, which is akin to someone testing
the validity of rev hashes in a custom validation function. That is, it
would probably be an option you can turn on if you need unforgeability.)

Chris


> Back to the matter at hand: experience from a long line of P2P systems
> (from FreeNet onwards) shows the value of giving pieces of distributed
> content their own unique and unforgeable IDs. CouchDB-style revision IDs
> partly meet this need, except that:
> (a) there are interoperability issues because every implementation has its
> own algorithm for generating the IDs;
> (b) none of the current ones are very unforgeable because they use the
> broken MD5 hash instead of something like SHA256;
> (c) the unforgeability isn't verified because the replicator doesn't check
> that a revision's ID matches its contents.
>
> At some point — Couchbase would like to build P2P systems in the future —
> we may need to take this more seriously, at which point it becomes
> necessary to have a canonical rev-ID generation algorithm which is enforced
> by the replicator. That algorithm will need to be standardized for
> interoperability purposes, since otherwise two implementations would reject
> each other's revisions as forgeries.
>
> That's why I see this issue as important.
>
> —Jens
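
To make points (b) and (c) above concrete, here is a minimal Python sketch of a content-derived rev ID using SHA-256, plus the check a replicator could apply before accepting a revision. This is not an existing CouchDB or Couchbase algorithm. The function names, the choice to hash the parent rev ID together with the body, and the use of sorted-key JSON as the canonical form are all assumptions for illustration; a real standard would also have to settle number formatting, Unicode normalization, and attachments.

```python
import hashlib
import json


def canonical_rev_id(generation: int, parent_rev: str, body: dict) -> str:
    """Derive a rev ID of the form '<generation>-<sha256 hex>' (hypothetical scheme).

    The digest covers the parent rev ID and the body, serialized as JSON
    with sorted keys so that key order does not change the result.
    """
    payload = json.dumps(
        [parent_rev, body],
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return f"{generation}-{hashlib.sha256(payload).hexdigest()}"


def verify_revision(rev_id: str, parent_rev: str, body: dict) -> bool:
    """Return True only if rev_id matches the ID recomputed from the contents."""
    generation_text, _, _ = rev_id.partition("-")
    if not generation_text.isdigit():
        return False
    expected = canonical_rev_id(int(generation_text), parent_rev, body)
    return expected == rev_id


# A first-generation revision has no parent, modeled here as an empty string.
rev = canonical_rev_id(1, "", {"name": "a"})
accepted = verify_revision(rev, "", {"name": "a"})
rejected = verify_revision(rev, "", {"name": "b"})
```

Here `accepted` is true and `rejected` is false, since changing the body changes the recomputed digest. The sketch also shows why standardization matters: two implementations that canonicalize the body differently would each fail the other's revisions in `verify_revision`.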




-- 
Chris Anderson  @jchris
http://www.couchbase.com
