couchdb-dev mailing list archives

From Antony Blakey
Subject Re: Bulk Docs
Date Thu, 12 Mar 2009 22:38:48 GMT
On 13/03/2009, at 1:46 AM, Damien Katz wrote:

> Atomic bulk docs is in the patch, it just doesn't do conflict  
> checking. If any docs are conflicts, they are saved anyway as  
> conflicts. This means it's really for message queue functionality,  
> not database consistency, your data is safe and committed but might  
> not be immediately available or consistent between docs. The reasons  
> we are removing all or nothing with conflict checking as it doesn't  
> work with replication (both offline and clustering) as docs are not  
> replicated in a single transaction or even in update order. And  
> getting it to work with partitioning would cause unacceptable write  
> performances. If we leave it, people will rely on the behavior not  
> understanding it doesn't really work with the rest of CouchDB.
> So if you are currently using bulk docs to guarantee inter-document  
> consistency, it already doesn't work with replication. It only works  
> on a single machine, so no master-slave and no hot stand-by setup  
> would work as neither are guaranteed to be in a consistent state at  
> any point.

The current bulk docs IS useful in a particular scenario.

It allows me, on a single node, to do transactional updates in  
response to e.g. a web submit/AJAX call, without having to expose the  
conflict model to the user and deal with conflicts in my single-node  

I then have two distinct phases of operation for peers:

1. Replication is triggered by the user and they do nothing else until  
replication completes, after which they have to resolve the conflicts  
generated by replication. This code deals with conflicts and a  
resolution UI and nothing else.

2. Normal operation - concurrent access by multiple applications,  
multiple users. The code never sees a conflict, and hence the user  
interaction and programming model are considerably simpler.
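
To illustrate the single-node, all-or-nothing behaviour with conflict checking described above, here is a minimal in-memory sketch. It is not CouchDB code and does not use CouchDB's API: `SingleNodeStore`, `bulk_update`, `ConflictError` and the integer revision scheme are all hypothetical names and simplifications chosen for the example, and it assumes each document id appears at most once per batch.

```python
class ConflictError(Exception):
    """Raised when any document in a batch has a stale expected revision."""


class SingleNodeStore:
    """Toy in-memory store (hypothetical; not CouchDB's implementation)."""

    def __init__(self):
        self._docs = {}  # doc id -> (revision number, body)

    def get(self, doc_id):
        """Return (revision, body) for doc_id, or None if it does not exist."""
        return self._docs.get(doc_id)

    def _current_rev(self, doc_id):
        entry = self._docs.get(doc_id)
        return entry[0] if entry else None

    def bulk_update(self, updates):
        """Apply every update in the batch, or none of them.

        `updates` is a list of (doc_id, expected_rev, new_body) tuples;
        expected_rev is None when the document is expected to be new.
        """
        # Phase 1: check every document before changing anything.
        stale = [doc_id for doc_id, expected_rev, _ in updates
                 if self._current_rev(doc_id) != expected_rev]
        if stale:
            # One stale revision rejects the whole batch.
            raise ConflictError(stale)
        # Phase 2: nothing conflicted, so write the whole batch.
        for doc_id, expected_rev, body in updates:
            self._docs[doc_id] = ((expected_rev or 0) + 1, body)


store = SingleNodeStore()
store.bulk_update([("order-1", None, {"total": 10}),
                   ("stock-1", None, {"count": 5})])

rejected = None
try:
    # "stock-1" is at revision 1, so expecting revision 7 is stale and
    # the update to "order-1" must not be applied either.
    store.bulk_update([("order-1", 1, {"total": 20}),
                       ("stock-1", 7, {"count": 4})])
except ConflictError as err:
    rejected = err.args[0]
```

Because the batch is rejected as a whole, the calling code only ever sees "saved" or "try again", which is what keeps conflicts out of the normal-operation code path. The guarantee holds only within one store, which is the limitation the quoted message points out for replication.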

There are a few additional features useful in this model, the  
principal ones being either 1) the ability to roll back a partial  
replication to deal with network failures; or 2) the ability to  
maintain monotonic source writes, which ensures that each replication  
step is consistent. To date neither of these features has gained  
sufficient community support to be considered.

I've presented this model before, and it has been rejected as being  
incompatible with the initial couchdb intentions, but in response to  
Tim Parkin, this is the reason for my fork. There are more details to  
my effort - pure binary bodies rather than JSON, unification of  
attachments with documents, strict metadata/content separation, map/ 
reduce over arbitrary data, generalised derivation, an immutable model  
of fully reified state, replication of operations rather than data -  
but maybe anyone interested can contact me offlist - it's no longer  
CouchDB and I'm sure everyone's sick of saying/reading "forget it,  
it's not going to happen" :)

Antony Blakey
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

One should respect public opinion insofar as is necessary to avoid  
starvation and keep out of prison, but anything that goes beyond this  
is voluntary submission to an unnecessary tyranny.
   -- Bertrand Russell
