couchdb-dev mailing list archives

From Antony Blakey <antony.bla...@gmail.com>
Subject Re: Fail on a simple case on replication
Date Tue, 24 Feb 2009 02:21:22 GMT

On 24/02/2009, at 12:15 PM, Chris Anderson wrote:

>>> Would it be overly difficult to just add in the ability to keep a
>>> full rev history based on a config setting?
>
> This would be a pretty big change. As Antony says, once you go down
> that path a little, you end up at something that is not really much
> like Couch.

I don't want to re-open a dead issue, but to clarify this - there are  
other models of replication that provide stronger weak-consistency  
guarantees - I urge you to read a few Bayou papers if you are  
interested. Using such replication would be very close to Couch. So I  
don't agree with the strength of Chris's comment.

The issue, however, is that Couch's identity is, and has always been,
largely determined by its replication model. There's so much more to
Couch that is independent of that, such as map/reduce views, forms,
futon, an HTTP API, JSON etc, that it's not immediately obvious that
it's the *replication model* that makes this product 'CouchDB'. The
project founder and the PMC are all committed to that replication
model, which is derived from Notes.

You can add all of the other Couch features, and in fact reuse all of  
the Couch code, with a different replication model, but it's unlikely  
it would be accepted into the Couch code base. If you want that, you  
need to fork and call it something different (which is what I'm  
doing). It's important to note however that the Couch replication  
model has some characteristics that cannot be achieved using any  
stronger form of consistency. In fact, technically speaking, Couch  
provides coherence, but NO consistency.

Given all of that, it would be good to have a very clear 'What is
Couch' that emphasizes the primacy of the replication model (and its
implications, both pro and con), because none of the other things IMO
are as central to the identity, as consequential, or as confusing
(except maybe reduce/re-reduce) as the operational semantics of the
replication model.

As an aside to this (and I'm not being bolshy), looking further ahead,  
Eventual Consistency, which seems to be promoted as an article of  
faith, is not *strictly* achievable in a partial replication  
environment. Achieving Eventual Consistency is also dependent on some  
other constraints, so depending on your deployment model, it can be  
more theoretical than practical. At the end of the day however,  
dealing with non-Monotonic Writes subsumes dealing with Eventual  
Consistency in all but asymptotic senses.

These are all points that I think should be made clearly and up front  
in the documentation, because a failure to understand Couch's  
replication model, and the implications for applications, both pro and
con, will IMO lead to failures that will be blamed on Couch, but are
in fact due to misunderstanding. You don't want a 'Couch is a piece of  
shit' meme to establish. IMO the bulk of Couch users will not think  
this through themselves, because they will be tool users, not tool  
builders.

> There's yet to be a really clear reference for how to do
> application-versioned documents in CouchDB. Hopefully we'll address
> the topic in the book, but we haven't gotten that far yet.
>
> The way I see it, the salient options are:
>
> A) leave it as _rev and answer the versioning question every week forever
> B) rename it to _mvcc or _lock or _token or something else that
> doesn't confuse people
>
> The main drawback of B is that when we start renaming _rev, someone
> else comes along and tries to take the opportunity to change _id, or
> otherwise change the whole system. If we can stick to just renaming to
> something clearer, I'm happy to go ahead with this.

Orthogonally, I still think the id and rev should be wrapped in a  
_meta tag, but modulo that ...

It's not a _lock. Saying it's a _token has nothing to do with its
function - it would be like calling a car a 'construct of metal'. It's
not _mvcc because that's the name of a technique, not a thing.

Maybe _mvcc_commit_id - although in the current implementation it  
isn't, it philosophically is and could be implemented that way. But  
really, it is a document version/revision identifier. Maybe put
'couch' in there to emphasize the internal nature of it, e.g.
'_couch_rev_id', i.e. something which, at the limit, might be
'_couch_private_revision_id_which_you_should_treat_as_opaque'.
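
To make the "treat it as opaque" point concrete, here is a minimal
sketch of how a client would use such an identifier purely as an
update precondition. This is an illustration only, not how Couch
generates or stores revisions: TinyStore, ConflictError and the use of
a random UUID as the identifier are all assumptions made up for this
example.

```python
import uuid


class ConflictError(Exception):
    """Raised when the caller's revision identifier is stale."""


class TinyStore:
    """Hypothetical in-memory store; NOT CouchDB's implementation."""

    def __init__(self):
        self._docs = {}  # doc id -> (revision identifier, body)

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        return rev, dict(body)

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # Equality is the only operation applied to the identifier:
        # it is never parsed, ordered, or used to look up old bodies.
        if rev != current_rev:
            raise ConflictError(doc_id)
        new_rev = uuid.uuid4().hex  # opaque on purpose (assumption)
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev


store = TinyStore()
rev1 = store.put("doc", {"n": 1})
rev2 = store.put("doc", {"n": 2}, rev=rev1)

try:
    store.put("doc", {"n": 3}, rev=rev1)  # rev1 is now stale
    stale_rejected = False
except ConflictError:
    stale_rejected = True
```

The client only ever hands back the identifier it was last given. It
cannot use it to retrieve an earlier version, which is the confusion
the rename is meant to avoid.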

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The intuitive mind is a sacred gift and the rational mind is a  
faithful servant. We have created a society that honours the servant  
and has forgotten the gift.
   -- Albert Einstein


