couchdb-dev mailing list archives

From Antony Blakey <>
Subject Re: couchdb transactions changes
Date Mon, 09 Feb 2009 07:40:09 GMT

On 09/02/2009, at 5:45 PM, Paul Davis wrote:

> A write takes the most recent status of the database. It performs the
> write using the append only semantics of editing btrees. When the
> write completes it uses an atomic write to the db header. This means
> that no matter what, new readers get a consistent view of the entire
> database.

The atomic write of the root is the commit. A reader, by virtue of an  
atomic read of the header, sees a commit point.
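To make the commit-point idea concrete, here is a minimal in-memory sketch. It is a toy model, not CouchDB's on-disk format: the `AppendOnlyStore` class, its `_log` list and its `_header` index are all hypothetical names, and a single Python assignment stands in for the atomic header write.

```python
class AppendOnlyStore:
    """Toy model of append-only storage with a single header pointer."""

    def __init__(self):
        self._log = [{}]    # append-only list of root snapshots (assumed model)
        self._header = 0    # index of the one durable commit point

    def snapshot(self):
        # One read of the header pins the reader to a single commit point.
        return self._log[self._header]

    def write(self, docs):
        # Build a new root from the most recent committed state.
        new_root = dict(self._log[self._header])
        new_root.update(docs)
        # Append only: earlier roots are never modified in place.
        self._log.append(new_root)
        # Updating the header is the commit; new readers see the new root.
        self._header = len(self._log) - 1


store = AppendOnlyStore()
store.write({"a": 1})
reader_view = store.snapshot()       # reader pinned to this commit point
store.write({"a": 2, "b": 3})        # a later commit does not disturb it
latest_view = store.snapshot()
```

In this sketch `reader_view` keeps seeing the state it started with, while `latest_view` reflects the later commit. The older root is only an in-memory reference held by a reader, which is the sense in which such extra commit points are ephemeral.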

> As I read your emails you seem to be assuming that CouchDB could walk
> back through the valid database commits. As far as I understand, this
> is not possible given the current database format. Furthermore, making
> it possible would require a large amount of engineering to accomplish.

No, you don't have to walk the commits. There is no record of commits,
except inasmuch as you might have a number of different roots in use
by concurrent processes at any given time, i.e. multiple commit points
are an ephemeral thing. There is only ever *one* durable commit point.

> AFAIK, we supported inter-document consistency to a single node. Now
> that we're more seriously contemplating multi-node setups it's becoming
> apparent that the single-node atomicity was a special case when it can
> be violated by something as simple as a replication.

Well, I believe I've shown that a simple change can make replication
(optionally) respect MVCC commit points. It involves very little change
to the source algorithm, doesn't impact the current semantics at all
unless you wish it to, and works on a per-replication-request basis.
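The difference between replication that ignores commit points and replication that respects one can be illustrated with a small sketch. It is not CouchDB's replicator: `Source`, `replicate`, the `pin_commit_point` flag and the `between_docs` hook are hypothetical names, chosen only to model a per-request option and a writer that commits while replication is in progress.

```python
class Source:
    """Toy source database whose committed state is a single root dict."""

    def __init__(self, docs):
        self.root = dict(docs)

    def bulk_write(self, docs):
        new_root = dict(self.root)
        new_root.update(docs)
        self.root = new_root    # one assignment models the atomic commit


def replicate(source, doc_ids, pin_commit_point, between_docs=None):
    """Copy doc_ids from source into a new target dict.

    pin_commit_point=True reads the root once and uses it for every doc.
    between_docs is an optional hook run after each doc is copied, used
    here to simulate a concurrent writer.
    """
    pinned = source.root if pin_commit_point else None
    target = {}
    for doc_id in doc_ids:
        root = pinned if pinned is not None else source.root
        target[doc_id] = root[doc_id]
        if between_docs is not None:
            between_docs()
    return target


def run(pin_commit_point):
    src = Source({"a": 100, "b": 0})
    # The writer moves 50 from a to b in one bulk write, mid-replication.
    transfer = lambda: src.bulk_write({"a": 50, "b": 50})
    return replicate(src, ["a", "b"], pin_commit_point, transfer)


unpinned = run(pin_commit_point=False)   # mixes two commit points
pinned = run(pin_commit_point=True)      # one consistent commit point
```

Without pinning, the target receives `a` from before the transfer and `b` from after it, a combination that never existed on the source. With pinning, the target matches a single commit point, and the choice is made per call, which is the per-replication-request property described above.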

This is orthogonal to the problem of cluster-ACID, which is also
doable, but I'm trying to work through this replication issue right now.

> I'm uncertain by what you mean by 'replication model'.

According to my use-case list, e.g. whether replication is exclusive
with normal operation, and whether it can result in conflict (i.e.
single-master deployments).

> My current
> understanding of replication is that it violates the promises of
> _bulk_docs. As Damien mentions further down, to support what you're
> asking for, you more or less need to repeat all _bulk_docs calls to
> your central server in app code. This is quite possible. If enough
> other people chimed in and voiced an opinion that this is something
> they are interested in, I can see it as a valid reason for supporting
> _bulk_docs like functionality in the future.

I don't want to replicate reified transactions. The current state of  
the source wrt. an MVCC commit point is all that is required iff your  
MVCC commit point is exposed to the user. You can build local  
transactions and a useful (NOT generic) form of distributed  
transactional consistency on top of that.

> If it's trivial, then post a patch to JIRA.

We're discussing a proposed patch, against which this idea would be a  
patch :)

I'm just addressing the idea that you can't compact the source  
database while replication occurs if replication is made MVCC aware.

> The thing is, your interpretation is asking CouchDB to prove the CAP
> theorem incorrect.

Not at all. I'm saying that there are application/deployment models  
and use-cases that

a) distinguish between replication and normal operation e.g. the system
moves from normal, conflict-free operation, to replication, to
conflict-resolution, back to normal operation; and/or

b) have a model that doesn't generate replication conflicts e.g.  
single-master replication doesn't fall under CAP.

Antony Blakey
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

One should respect public opinion insofar as is necessary to avoid  
starvation and keep out of prison, but anything that goes beyond this  
is voluntary submission to an unnecessary tyranny.
   -- Bertrand Russell
