couchdb-replication mailing list archives

From Jens Alfke <j...@couchbase.com>
Subject Re: rev hash stability
Date Fri, 17 Oct 2014 23:04:21 GMT

> On Oct 17, 2014, at 12:48 PM, Brian Mitchell <brian@standardanalytics.io> wrote:
> 
> 1. Any revs that match between two documents should be assumed to be the same
> revision of the document. This is important outside of optimization (N-way replications
> for example).

Again, I'm not sure what you mean by "match". (Or by "between two documents" … none of this
makes sense if there's more than one document involved. Was that a typo?)

If by "match" you mean "equal contents [aside from the _rev property]", I don't think any
current implementation does what you said. First off, there's a very important piece of information
that's not stored in a revision: its parent revision. Two revisions with identical contents
but different parents should never be considered equal.
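
To illustrate the point about parents, here is a minimal Python sketch, not the algorithm of
CouchDB or any other implementation. It assumes a hypothetical rev_id() helper that hashes the
parent revision ID together with a canonical JSON encoding of the body, so equal contents under
different parents get different IDs. The helper name, the SHA-1 choice, and the sample IDs are
all assumptions made for this example.

```python
import hashlib
import json


def rev_id(parent_rev, body):
    """Hypothetical revision ID: '<generation>-<digest of parent ID and body>'."""
    # The generation is one more than the parent's, or 1 for a first revision.
    generation = int(parent_rev.split("-")[0]) + 1 if parent_rev else 1
    # Canonical encoding so that key order does not affect the digest.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    payload = (parent_rev or "") + "\n" + canonical
    digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()
    return f"{generation}-{digest}"


body = {"item": "buy milk", "done": True}

# Same parent, same contents: the IDs agree.
same_parent_equal = rev_id("1-aaa", body) == rev_id("1-aaa", body)

# Different parents, same contents: the IDs differ, so the two
# revisions are never mistaken for each other.
different_parent_equal = rev_id("1-aaa", body) == rev_id("1-bbb", body)
```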

Even if you accept that, I don't think it's feasible to merge two revisions with different
IDs but equal contents. A minor issue is that the replicator would incur extra overhead comparing
each new revision against its existing peers to determine whether they're equal. But the big
problem is that if it is equal to an already stored revision … then what? You can't throw one of
them away, and there's no mechanism to record their equality, so keeping them both doesn't
make sense.

> 3. Optionally: revisions can be generated deterministically to allow idempotent
> operations. This is really important for clusters (non-optional in practice) but
> has very little importance for PouchDB.

No, it's very important for a distributed system, of which PouchDB is a likely client. The
example I gave before, of two people checking off the same to-do list item, is an example
of an idempotent operation. I could come up with a lot more.
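
The to-do list case can be sketched in Python as follows. This is an illustration under stated
assumptions, not the behavior of any particular implementation: rev_id() is a hypothetical
deterministic scheme (SHA-1 over the parent ID and canonical JSON body), RevTree is a toy
revision tree, and "1-abc" is a made-up starting revision. Two users who make the same edit to
the same parent produce the same revision ID, so the second arrival is a no-op rather than a
conflict.

```python
import hashlib
import json


def rev_id(parent_rev, body):
    # Hypothetical scheme: "<generation>-<sha1 of parent ID + canonical JSON body>".
    generation = int(parent_rev.split("-")[0]) + 1
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha1(f"{parent_rev}\n{canonical}".encode("utf-8")).hexdigest()
    return f"{generation}-{digest}"


class RevTree:
    """Toy revision tree: maps each rev ID to its parent rev ID."""

    def __init__(self, root_rev):
        self.parents = {root_rev: None}

    def insert(self, parent_rev, body):
        """Add a revision; return (rev ID, whether it was new)."""
        rid = rev_id(parent_rev, body)
        is_new = rid not in self.parents
        if is_new:
            self.parents[rid] = parent_rev
        return rid, is_new

    def leaves(self):
        """Rev IDs that no other revision names as its parent."""
        non_leaves = set(self.parents.values())
        return [rid for rid in self.parents if rid not in non_leaves]


root = "1-abc"  # made-up ID of the revision both users started from
server = RevTree(root)

# Alice and Bob each check off the same item, independently.
alice_rev, alice_new = server.insert(root, {"item": "buy milk", "done": True})
bob_rev, bob_new = server.insert(root, {"done": True, "item": "buy milk"})

# Both edits map to one revision, so the tree has a single leaf and no conflict.
conflict_free = alice_rev == bob_rev and len(server.leaves()) == 1
```

With randomly generated revision IDs, the same two edits would instead yield two sibling leaves
that have to be resolved as a conflict.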

—Jens