couchdb-replication mailing list archives

From Brian Mitchell <br...@standardanalytics.io>
Subject Re: rev hash stability
Date Fri, 17 Oct 2014 19:48:52 GMT

> On Oct 17, 2014, at 3:41 PM, Jens Alfke <jens@couchbase.com> wrote:
> 
> 
>> On Oct 17, 2014, at 12:15 PM, Brian Mitchell <brian@standardanalytics.io> wrote:
>> 
>> Simply put: if and only if the revs match we should assume some optimism just like we
>> do with things like atts_since. There’s already a lot of trust between two nodes for
>> replication and we should assume that matching revs were either unique (or random) or
>> based on some deterministic property that isn’t likely to collide unless it was an
>> equivalent operation.
> 
> I'm sorry, I've read this a few times and I can't figure out exactly what your meaning
> is. Could you elaborate? Particularly, what does "if the revs match" mean, exactly?
> 
> Also, I don't think your statement "there’s already a lot of trust between two nodes
> for replication" is accurate in all cases. You seem to be thinking of a server cluster
> (a la BigCouch) but CouchDB-style replication is often used in a more distributed way.
> Both PouchDB and Couchbase Lite use replication between servers and clients. A client
> can be trusted to be acting on behalf of a user, but not beyond that.
> 
> —Jens

No problem. I probably kept the message too short.

The issue is that requiring revs to match assumes a lot about the context
different implementations are designed to operate in. The optimization
makes a lot of sense in some cases (clustering for availability being the most
obvious).

This implies there is a contract for how any implementation should treat revisions:

1. Any revs that match between two documents should be assumed to be the same
revision of the document. This is important outside of optimization (N-way replications
for example).

2. Each implementation must be trusted to generate unique revisions.

3. Optionally: revisions can be generated deterministically to allow idempotent
operations. This is really important for clusters (non-optional in practice) but
is of very little importance for PouchDB.
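
To make points 2 and 3 concrete, here is a rough sketch of two rev generators
that would both satisfy the contract. The function names, the SHA-256 digest,
and the JSON canonicalization are all assumptions made up for illustration;
none of them is what CouchDB, PouchDB, or Couchbase Lite actually does.

```python
import hashlib
import json
import uuid


def next_generation(parent_rev):
    """Generation number for a child of parent_rev (None means a new document)."""
    if parent_rev is None:
        return 1
    return int(parent_rev.split("-", 1)[0]) + 1


def random_rev(parent_rev):
    """Point 2 only: a unique rev with a random suffix (hypothetical scheme)."""
    return f"{next_generation(parent_rev)}-{uuid.uuid4().hex}"


def deterministic_rev(parent_rev, body):
    """Points 2 and 3: the same parent and body always give the same rev.

    The digest and canonical form are assumptions for this sketch, not any
    real implementation's algorithm.
    """
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    payload = f"{parent_rev or ''}:{canonical}".encode("utf-8")
    suffix = hashlib.sha256(payload).hexdigest()[:32]
    return f"{next_generation(parent_rev)}-{suffix}"
```

With the deterministic variant, two nodes applying the same edit to the same
parent end up with the same rev, so the write is idempotent. With the random
variant they end up with two distinct revs, which is still valid under the
contract; it just shows up as a conflict to resolve.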

I’d urge implementations to document what guarantees their revs have but
I would stop short of exposing the implementation (like the digest used or
RNG function) as that is out of scope for the _rev contract for compatible
implementations.

There are many reasons to settle at this level of detail, backwards compatibility
being the most important. The other is that it could allow other sorts of rev
encoding in the future for some implementations (cheaper tree merges being
one thing worth revisiting).

So PouchDB should generate revs that make sense for PouchDB’s implementation.
The contract of how these revs are interpreted shouldn’t constrain it to implementing
the same JSON normalization and digest that others do. The same goes for the other
Couches.
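
As a sketch of what interpreting revs under this contract could look like on
the receiving side: the peer compares revs as opaque strings and never tries
to recompute the sender's digest. The helper name missing_revs is hypothetical
and only illustrates the idea.

```python
def missing_revs(known_revs, offered_revs):
    """Return the offered revs the receiver does not hold, in offered order.

    Revs are compared by string equality only; how the sender produced the
    suffix (digest, RNG, anything else) is never inspected.
    """
    known = set(known_revs)
    return [rev for rev in offered_revs if rev not in known]
```

For example, a receiver holding "1-a" and "2-b" that is offered "2-b" and
"3-c" would ask only for "3-c", whichever implementation generated it.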

Brian.

