couchdb-dev mailing list archives

From Paul Davis
Subject Re: couchdb transactions changes
Date Mon, 09 Feb 2009 07:15:48 GMT
On Sun, Feb 8, 2009 at 11:27 PM, Antony Blakey wrote:
> On 09/02/2009, at 2:35 PM, Paul Davis wrote:
>> There is no concept of an "MVCC boundary" anywhere in the code that
>> I'm aware of.
> Database updates create an MVCC commit, reads are all wrt an MVCC commit.
> MVCC boundaries, e.g. commit points, are a fundamental part of the Couch
> low-level architecture. When _bulk_docs was ACID, they were exposed in the
> user-level API.

My understanding of the code is this:

A write takes the most recent status of the database. It performs the
write using the append only semantics of editing btrees. When the
write completes it uses an atomic write to the db header. This means
that no matter what, new readers get a consistent view of the entire
database.
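
As a rough illustration of the write path described above, here is a
toy Python sketch. It is not CouchDB's file format or code: the class
and method names are hypothetical, a plain dict stands in for the
btree root, and a single reference swap stands in for the atomic
header write.

```python
from types import MappingProxyType


class AppendOnlyStore:
    """Toy model only; names and structure are assumptions, not CouchDB internals."""

    def __init__(self):
        # Stands in for the db header: it points at the current root.
        self._header = {}

    def open_reader(self):
        # A reader captures whatever root the header points at right now.
        # The read-only proxy keeps readers from modifying that root.
        return MappingProxyType(self._header)

    def write(self, updates):
        # Build a new root from the most recent state; the old root is
        # never modified, mirroring append-only semantics.
        new_root = {**self._header, **updates}
        # One reference swap plays the role of the atomic header write.
        self._header = new_root


store = AppendOnlyStore()
store.write({"a": 1})
old_reader = store.open_reader()   # opened before the next write
store.write({"a": 2, "b": 3})
new_reader = store.open_reader()   # opened after the header swap
```

In this sketch old_reader still sees only the first write, while
new_reader sees both updates together. Nothing here lets a caller walk
back to an earlier root once no reader holds it.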

As I read your emails you seem to be assuming that CouchDB could walk
back through the valid database commits. As far as I understand, this
is not possible given the current database format. Furthermore, making
it possible would require a large amount of engineering to accomplish.

>> I think the bigger point here is that what you're asking for violates
>> a huge swath of assumptions baked into the core of CouchDB. Asking
>> CouchDB to do consistent inter-document writes is going to require you
>> to either change a large amount of internal code or write some very
>> specific app code to get what you want.
> But it already did consistent inter-document writes - the removal of that is
> what this discussion is about.

AFAIK, we supported inter-document consistency only on a single node.
Now that we're more seriously contemplating multi-node setups, it's
becoming apparent that single-node atomicity was a special case, since
it can be violated by something as simple as a replication.

>> You may be able to get atomic
>> interdocument updates on a single node, but this is violated if you do
>> so much as try and replicate.
> And 'so much as try and replicate' is the issue, because the replication
> model varies for different use cases. In my previous posts you'll see that
> I'm promoting the idea that the local, exclusive-replication use-case is
> significant, and useful. There are useful models where replication is a
> fundamentally different operation than local use.

I'm uncertain what you mean by 'replication model'. My current
understanding of replication is that it violates the promises of
_bulk_docs. As Damien mentions further down, to support what you're
asking for, you more or less need to repeat all _bulk_docs calls to
your central server in app code. This is quite possible. If enough
other people chimed in and voiced an opinion that this is something
they are interested in, I can see it as a valid reason for supporting
_bulk_docs-like functionality in the future.

But this is not replication and is something that I assume changes the
way your app works.
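
A minimal Python sketch of what "repeating _bulk_docs calls in app
code" might look like. The server URLs, database name and helper names
are hypothetical; only the POST /{db}/_bulk_docs endpoint with a
{"docs": [...]} body is CouchDB's own. The sketch only builds the
requests, it does not send them.

```python
import json
import urllib.request


def build_bulk_docs_request(base_url, db_name, docs):
    """Build (but do not send) a POST to the _bulk_docs endpoint."""
    body = json.dumps({"docs": docs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{db_name}/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def replay_batch(docs, servers, db_name):
    """Prepare the same batch for every server instead of replicating it."""
    return [build_bulk_docs_request(s, db_name, docs) for s in servers]


docs = [
    {"_id": "account_a", "balance": 50},
    {"_id": "account_b", "balance": 50},
]
# Hypothetical local and central servers.
servers = ["http://localhost:5984", "http://central.example:5984"]
pending = replay_batch(docs, servers, "mydb")

# Each request could then be sent with urllib.request.urlopen(req).
```

The app, not the replicator, is responsible for delivering the batch
to each server and for handling a server that rejects or misses it.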

>> IMO, it would be better to not support _bulk_docs for exactly this
>> reason. People that use _bulk_docs will end up assuming that the
>> atomic properties will carry over into places where they don't
>> actually hold.
> But it can for local operations, and replications conflicts can be dealt
> with separately from normal operation.
>> It occurs to me that once you get to the point of writing source and
>> target database locking, you no longer need _bulk_docs. You'd have
>> enough code to do all the atomic interdoc writes you need.
> Only by giving up all local concurrency. Locking is only wrt. replication
> vs. local operation. And I think the most recent emails are showing that
> source locking is not as black-and-white as you think - it's only wrt
> compaction, and even then I think it's restricted to a requirement to not
> compact past the MVCC state being used by the replication process, which IMO
> is a trivial issue because compaction cannot invalidate the head MVCC state,
> and a replication request will always use the head state in effect at
> request-time.

If it's trivial, then post a patch to JIRA. The best way to make sure
that CouchDB supports the functionality you want is to write the code.
I personally only have a very thin understanding of the complexity
involved in core DB updates. Judged against my understanding, your
suggestions worry me because of the added code complexity, but I'd be
delighted to be proven wrong so we can support a wider range of
use cases.

>> Though it'd
>> be rather un-couchy.
> CouchDB has wide applicability, and what you regard as un-couchy is only
> relative to a certain use-case. I'm trying to promote a more generous
> interpretation of what CouchDB is, and can be.

The thing is, your interpretation is asking CouchDB to prove the CAP
theorem incorrect. There are huge billboards saying that CouchDB is
sacrificing consistency to gain availability and partition-tolerance.

> Antony Blakey
> --------------------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
> Human beings, who are almost unique in having the ability to learn from the
> experience of others, are also remarkable for their apparent disinclination
> to do so.
>  -- Douglas Adams

Paul Davis
