couchdb-dev mailing list archives

From Robert Newson <>
Subject Re: Replication and forms of weak consistency
Date Mon, 16 Feb 2009 11:15:32 GMT
On Mon, Feb 16, 2009 at 6:24 AM, Antony Blakey <> wrote:
> Please let's stop using the word 'transactional'. Monotonic Writes requires
> nothing more 'transactional' than CouchDB already has e.g. stable storage.
> The word 'transaction' is commonly used to mean user-level ACID semantics,
> which neither the Bayou nor PRACTI models provide.
> On 16/02/2009, at 3:55 PM, Chris Anderson wrote:
>> So it seems as though, when a long history is replicated under your
>> model (interleaving many different client updates) we would end up
>> sending a lot more data over the wire under your proposed model.
> With the tradeoff that you get Monotonic Writes. Whether you see a lot more
> data depends on the frequency of Replication wrt writes, and the
> distribution of writes. Clustered writes with isolation group optimization
> (i.e. protocol, not user-initiated) would end up sending little, if any more
> data than would currently be sent. Furthermore, Monotonic Writes might allow
> you to do differential encoding of subsequent revisions. This could be a
> fantastic win that would reduce the amount of data sent, even compared to
> the current protocol. Especially for attachments.
>> In order to ensure that the isolation group stays together, even should
>> replication fail before completion, we'd have to send the latest
>> doc-rev for every doc touched in each isolated doc group.
> In order to get Monotonic Writes you need to do that, and it's independent
> of isolation groups. Isolation groups are a feature that allows you to send
> *less* data. Exposing them to the user is entirely another question.
>> In the current system we just send the latest non-conflicted rev or
>> all the conflict revs if they exist. It makes for a lot less data on
>> the wire. (Correct me if I'm wrong.)
> Correct, although incremental replication creates states that don't provide
> a Monotonic Write guarantee.
>> Your story about comments being replicated without their associated
>> posts is a good example of the counter-intuitive things that can
>> happen when replication fails before completion. Thanks for that.
> The current replication implementation, not replication per se.
>> I think these questions are interesting, I really do. However, in my
>> mind, what makes CouchDB relaxing, is that we're not trying to be
>> ambitious on the transactional guarantees front. So far, we've tried
>> to give only the guarantees we know we can afford to give, and
>> concentrate on getting them right.
> It isn't clear that the tradeoff needs to be forced. A system that provides
> Monotonic Writes can easily optimize for bandwidth, either adaptively or via
> configuration, but the reverse is not true.
> One example of adaptive optimization is automatically increasing the size of
> the isolation groups depending on the measured performance characteristics
> of the channel, and the size of the data.
> You can configure a system that can provide Monotonic Write guarantees to
> not do so.
>> Robert's point that much of this can be implemented on top of CouchDB
>> is an interesting one. If it is indeed the case, then the question
>> becomes whether clients or the database should be responsible for
>> providing the transactional API.
> I'm still processing Robert's point in the context of the papers, but I'm
> not sure that it's true that it can be done without modification to CouchDB.
> It may not be practical to carry session-long version vectors in a
> light-weight client. I'm more certain that it can't be done in the context
> of Partial Replication. In any case, this can be done once, efficiently, in
> the server, rather than inefficiently (if at all) in a lightweight client.

I agree it may not be generally practical to carry a version vector in
the client (specifically in a 2k cookie) but it may be practical in my
specific case. I don't (yet) see why partial replication is a
particular problem, though. In my case there are no relations between
documents, it's just desirable that clients can read their writes and
see monotonic writes for the duration of a session. That the session
guarantees are violated when a server fails, and clients fail over to
a replica, seems unavoidable unless all replicas proceed lock-step
(2pc, 3pc, paxos, etc). So, the paper appears to show a means for
clients and servers to 'paper over the cracks'. In my view, the
client's version vector would let me block a user request while the
newly chosen replica catches up to the appropriate level, or return
an error if it can't or if replication fails or times out.
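
The blocking behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not anything CouchDB provides: the version vector is assumed to be a plain mapping from replica id to the highest update sequence seen, and `dominates`, `wait_until_caught_up`, `ReplicaBehind` and `get_server_vector` are hypothetical names invented for the example.

```python
import time
from typing import Callable, Dict

# Hypothetical representation: replica id -> highest update sequence seen.
VersionVector = Dict[str, int]


class ReplicaBehind(Exception):
    """Raised when the replica does not catch up before the timeout."""


def dominates(server: VersionVector, session: VersionVector) -> bool:
    """Return True if the server has seen everything the session has seen."""
    return all(server.get(rid, 0) >= seq for rid, seq in session.items())


def wait_until_caught_up(
    get_server_vector: Callable[[], VersionVector],
    session: VersionVector,
    timeout: float = 5.0,
    poll_interval: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until the replica covers the session's vector, else raise.

    get_server_vector is an assumed callback that returns the replica's
    current version vector each time it is called.
    """
    deadline = time.monotonic() + timeout
    while not dominates(get_server_vector(), session):
        if time.monotonic() >= deadline:
            raise ReplicaBehind("replica did not reach the session's version vector")
        sleep(poll_interval)
```

A front end would call `wait_until_caught_up` after failing over to a new replica and before serving the user's request, turning `ReplicaBehind` into an error response.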

> In the Bayou papers the sessions need to be persistent - the Bayou context
> is an explicit client-server model with a persistent client.

My read of the paper, which I grant could be flawed, is that sessions
can be persistent (as they are just a relatively small version vector)
but I saw no obligation to make them so. That is, it's reasonable that
a client starts a session, does a set of operations, and then goes
away. All the consistency paper seems to mandate is that clients can
specify the guarantees they wish and the algorithm informs them
when or if those guarantees are violated. It's the implementation's
job to reduce the times that violations happen (more frequent and
faster replication or synchronization, for example). That's why I feel
it's much lighter than you do, but I happily concede I may be wrong.
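
To make the "inform rather than enforce" reading concrete, here is a minimal sketch of a short-lived session that only reports which requested guarantees a given server cannot currently meet. It follows the four session guarantees from the Bayou paper (Read Your Writes, Monotonic Reads, Writes Follow Reads, Monotonic Writes), but the `Session` class, its method names, the string labels and the dict-based version vectors are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical representation: replica id -> highest update sequence seen.
VersionVector = Dict[str, int]


def covers(server: VersionVector, needed: VersionVector) -> bool:
    """True if the server has seen every update recorded in `needed`."""
    return all(server.get(rid, 0) >= seq for rid, seq in needed.items())


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Pointwise maximum of two version vectors."""
    return {rid: max(a.get(rid, 0), b.get(rid, 0)) for rid in set(a) | set(b)}


@dataclass
class Session:
    # Any subset of {"RYW", "MR", "WFR", "MW"} chosen by the client.
    guarantees: Set[str]
    read_vector: VersionVector = field(default_factory=dict)
    write_vector: VersionVector = field(default_factory=dict)

    def violated_on_read(self, server: VersionVector) -> List[str]:
        """Guarantees that reading from this server would violate."""
        out = []
        if "RYW" in self.guarantees and not covers(server, self.write_vector):
            out.append("RYW")  # server lacks some of the session's writes
        if "MR" in self.guarantees and not covers(server, self.read_vector):
            out.append("MR")  # server is behind an earlier read
        return out

    def violated_on_write(self, server: VersionVector) -> List[str]:
        """Guarantees that writing to this server would violate."""
        out = []
        if "WFR" in self.guarantees and not covers(server, self.read_vector):
            out.append("WFR")  # write would precede writes already read
        if "MW" in self.guarantees and not covers(server, self.write_vector):
            out.append("MW")  # write would precede the session's own writes
        return out

    def record_read(self, server: VersionVector) -> None:
        self.read_vector = merge(self.read_vector, server)

    def record_write(self, server: VersionVector) -> None:
        self.write_vector = merge(self.write_vector, server)
```

Nothing here has to outlive the session: the two vectors can be discarded when the client goes away, which is the sense in which the session state is small.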

> Antony Blakey
> --------------------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
> Reflecting on W.H. Auden's contemplation of 'necessary murders' in the
> Spanish Civil War, George Orwell wrote that such amorality was only really
> possible, 'if you are the kind of person who is always somewhere else when
> the trigger is pulled'.
>  -- John Birmingham, "Appeasing Jakarta"
