couchdb-replication mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: rev hash stability
Date Fri, 17 Oct 2014 20:50:41 GMT
I’m with Chris on this one. The replication protocol should define a portable way of creating
deterministic rev ids while leaving room for random or other schemes where applicable.
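
To make the idea concrete, here is a minimal sketch of how deterministic and random rev ids could coexist behind the same "N-suffix" shape. Everything in it is an assumption for illustration: the helper names (canonical_json, next_generation, deterministic_rev, random_rev), the choice of MD5, and the sorted-key JSON encoding are hypothetical and are not the algorithm used by CouchDB, PouchDB, Couchbase Lite, or any other implementation.

```python
import hashlib
import json
import uuid


def canonical_json(doc):
    """Serialize a document body with sorted keys and no extra whitespace.

    Assumption: sorted-key JSON is "canonical enough" for this sketch. A real
    portable scheme would also have to pin down number and string encoding.
    """
    body = {k: v for k, v in doc.items() if k not in ("_id", "_rev")}
    return json.dumps(
        body, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def next_generation(parent_rev):
    """Return the generation number that follows parent_rev (1 if no parent)."""
    if parent_rev is None:
        return 1
    return int(parent_rev.split("-", 1)[0]) + 1


def deterministic_rev(doc, parent_rev=None):
    """Hypothetical deterministic rev: digest of the parent rev plus the body."""
    digest = hashlib.md5()
    digest.update((parent_rev or "").encode("utf-8"))
    digest.update(b"\x00")  # separator between parent rev and body
    digest.update(canonical_json(doc))
    return f"{next_generation(parent_rev)}-{digest.hexdigest()}"


def random_rev(parent_rev=None):
    """Hypothetical random rev: same shape, but the suffix is a random UUID."""
    return f"{next_generation(parent_rev)}-{uuid.uuid4().hex}"
```

With this sketch, two peers that apply the same edit to the same parent revision would compute the same rev regardless of key order, while an implementation that prefers random revs still produces ids of the same shape.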

On 17 Oct 2014, at 22:47 , Chris Anderson <jchris@couchbase.com> wrote:

> I would never suggest that a random rev or other style rev shouldn't be
> functional/expected. It's just that if you do want to generate the same
> revs as somebody else right now, it's hard. Making it less hard would be
> good for everyone.
> 
> Chris
> 
> On Friday, October 17, 2014, Brian Mitchell <brian@standardanalytics.io>
> wrote:
> 
>> 
>>> On Oct 17, 2014, at 3:41 PM, Jens Alfke <jens@couchbase.com> wrote:
>>> 
>>> 
>>>> On Oct 17, 2014, at 12:15 PM, Brian Mitchell <brian@standardanalytics.io> wrote:
>>>> 
>>>> Simply put: if and only if the revs match we should assume some optimism just like we
>>>> do with things like atts_since. There’s already a lot of trust between two nodes for replication
>>>> and we should assume that matching revs were either unique (or random) or based on some
>>>> deterministic property that isn’t likely to collide unless it was an equivalent operation.
>>> 
>>> I'm sorry, I've read this a few times and I can't figure out exactly what your meaning is.
>>> Could you elaborate? Particularly, what does "if the revs match" mean, exactly?
>>> 
>>> Also, I don't think your statement "there’s already a lot of trust between two nodes for
>>> replication" is accurate in all cases. You seem to be thinking of a server cluster (a la
>>> BigCouch) but CouchDB-style replication is often used in a more distributed way. Both PouchDB
>>> and Couchbase Lite use replication between servers and clients. A client can be trusted to be
>>> acting on behalf of a user, but not beyond that.
>>> 
>>> —Jens
>> 
>> No problem. I probably kept the message too short.
>> 
>> The issue is that requiring revs to match assumes a bit much about the context
>> different implementations are designed to operate in. The optimization makes
>> a lot of sense in some cases (clustering for availability being the most obvious).
>> 
>> This implies there is a contract for how any implementation should treat revisions:
>> 
>> 1. Any revs that match between two documents should be assumed to be the same
>> revision of the document. This is important outside of optimization (N-way
>> replications for example).
>> 
>> 2. Each implementation must be trusted to generate unique revisions.
>> 
>> 3. Optionally: revisions can be generated deterministically to allow idempotent
>> operations. This is really important for clusters (non-optional in practice) but
>> has very little importance for PouchDB.
>> 
>> I’d urge implementations to document what guarantees their revs have but
>> I would stop short of exposing the implementation (like the digest used or
>> RNG function) as that is out of scope for the _rev contract for compatible
>> implementations.
>> 
>> There are many reasons to settle at this level of detail, backwards compatibility
>> being the most important. The other is that it could allow other sorts of rev
>> encoding in the future for some implementations (cheaper tree merges being
>> one thing worth revisiting).
>> 
>> So PouchDB should generate revs that make sense for PouchDB’s implementation.
>> The contract of how these revs are interpreted shouldn’t constrain it to implementing
>> the same JSON normalization and digest that others do. Same goes for other Couches.
>> 
>> Brian.
>> 
>> 
> 
> -- 
> Chris Anderson  @jchris
> http://www.couchbase.com

