couchdb-dev mailing list archives

From Antony Blakey <antony.bla...@gmail.com>
Subject Re: couchdb transactions changes
Date Sun, 08 Feb 2009 06:27:40 GMT
I think discussion of this issue is complicated by the lack of a clear
exposition of the different ways in which CouchDB may be used or
deployed. I have the following in mind:

---------------------------------------

A. A single-node database engine embedded in a desktop application.
B. A single-node database server.
C. A multi-node clustered database server.

Furthermore it might have:

D. No replication or replication from the app purely for backup. No  
conflict is possible.
E. Replication from a distinguished peer that accepts write operations  
e.g. a content/query distribution mechanism. No conflict is possible.
F. Replication in a p2p mesh e.g. collaborative content management.  
Conflict is possible.

Then, in a non-orthogonal way, conflicts are dealt with:

G. Not at all because they can't arise.
H. Replication is under user control, and exclusive with 'normal'  
operation. Conflict resolution is only caused by replication, said  
conflicts being resolved by the user using a specialized UI/Workflow.  
Normal operation sees no conflicts.
I. Replication is concurrent with normal operation, and may or may not  
be under user control. Normal operation sees conflicts.

---------------------------------------

I have a pending deployment project of type A/E/G, and pending  
projects of types A/F/H and B+A/E/G. In all my cases, update and  
indexing throughput is not an issue, although replication efficiency,  
especially of incremental updates to attachments, is a concern.
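To make the A/E/G shorthand concrete, here is a minimal sketch in Python that encodes the three axes above and checks that a combination hangs together. The class and method names are purely illustrative, and the consistency rule (G only when conflicts can't arise, H or I only when they can) is my reading of the list above, not anything CouchDB defines.

```python
from dataclasses import dataclass
from enum import Enum


class Node(Enum):
    EMBEDDED = "A"       # single-node engine embedded in a desktop app
    SINGLE_SERVER = "B"  # single-node database server
    CLUSTER = "C"        # multi-node clustered database server


class Replication(Enum):
    NONE_OR_BACKUP = "D"  # no conflict is possible
    FROM_PEER = "E"       # one distinguished writer; no conflict is possible
    P2P_MESH = "F"        # conflict is possible


class Conflicts(Enum):
    IMPOSSIBLE = "G"          # not dealt with because they can't arise
    USER_RESOLVED = "H"       # resolved by the user during exclusive replication
    SEEN_IN_NORMAL_USE = "I"  # replication runs concurrently with normal use


@dataclass(frozen=True)
class Profile:
    """Hypothetical deployment profile: one choice from each axis."""
    node: Node
    replication: Replication
    conflicts: Conflicts

    def conflict_possible(self) -> bool:
        # Per the list above, only the p2p mesh (F) can produce conflicts.
        return self.replication is Replication.P2P_MESH

    def is_consistent(self) -> bool:
        # Assumed rule: G exactly when conflicts cannot arise.
        return self.conflict_possible() != (self.conflicts is Conflicts.IMPOSSIBLE)

    def label(self) -> str:
        parts = (self.node, self.replication, self.conflicts)
        return "/".join(p.value for p in parts)


if __name__ == "__main__":
    p = Profile(Node.EMBEDDED, Replication.FROM_PEER, Conflicts.IMPOSSIBLE)
    print(p.label(), p.is_consistent())
```

Under this reading, a combination such as C/F/G would be flagged as inconsistent, since a p2p mesh can produce conflicts that G assumes away.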

I understand that there is a sense in which CouchDB was on a  
trajectory pre-Apache to be C/F/I, but I wonder if the desire to  
achieve that isn't *unnecessarily* at the expense of other deployment  
models. In particular, some of these sound like a Notes client, and I  
have heard CouchDB promoted as 'Notes done right', hence my focus on  
those kinds of use cases (as opposed to high-throughput db servers).  
IMO it would be a good thing to not burden these other use cases with  
the operational cost of supporting just one of them.

Obviously supporting transactions in a partition-based cluster can  
impose a cost (although only if the transaction spans the cluster in  
some way, the probability of which is potentially lessened by the  
partitioning), but what if one could turn them off via configuration?

From what Damien has said about replication, I'm getting the idea
that it is possible to do replication on an MVCC boundary, in the same  
way that a view represents an MVCC boundary, although I hear loud and  
clear that CouchDB has never, ever, claimed that replication works in  
that manner.

The benefit of a transactional API vs. a conflict-based API, for local
operations, is not only that certain models can only be implemented
using a transactional API, but also that the transaction failure mode
has a clear and simple reflection into the GUI. Users have an
expectation of transactionality, and IMO domain-dependent conflict
resolution (as opposed to domain-independent transactionality) is a
leap into the unknown. I think it's both less natural and more work
for the user.
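As a rough illustration of the difference in failure modes, here is a toy in-memory store in Python. It is entirely hypothetical (ToyStore, StaleRevision and both method names are made up for this sketch, and none of it is CouchDB's API). One method applies each write independently and reports which ones were rejected; the other applies all writes or none.

```python
class StaleRevision(Exception):
    """Raised when a write names a revision that is no longer current."""


class ToyStore:
    """Hypothetical store mapping doc id -> (revision number, value)."""

    def __init__(self):
        self.docs = {}

    def rev(self, doc_id):
        # Documents that do not exist yet are treated as revision 0.
        return self.docs.get(doc_id, (0, None))[0]

    def bulk_conflict_style(self, writes):
        """Apply each write on its own; return the ids that were rejected."""
        rejected = []
        for doc_id, expected_rev, value in writes:
            if self.rev(doc_id) != expected_rev:
                rejected.append(doc_id)
            else:
                self.docs[doc_id] = (expected_rev + 1, value)
        return rejected

    def bulk_transactional(self, writes):
        """Apply all writes, or raise and apply none of them."""
        # First pass: validate every write before touching any document.
        for doc_id, expected_rev, _ in writes:
            if self.rev(doc_id) != expected_rev:
                raise StaleRevision(doc_id)
        # Second pass: every check passed, so commit the whole batch.
        for doc_id, expected_rev, value in writes:
            self.docs[doc_id] = (expected_rev + 1, value)


if __name__ == "__main__":
    store = ToyStore()
    store.bulk_transactional([("invoice", 0, "draft"), ("total", 0, 100)])
    # The second write below carries a stale revision for "total".
    batch = [("invoice", 1, "final"), ("total", 0, 120)]
    try:
        store.bulk_transactional(batch)
    except StaleRevision as exc:
        print("nothing saved; stale document:", exc)
    print("partially saved; rejected:", store.bulk_conflict_style(batch))
```

In the transactional case the GUI has one thing to say ("your changes were not saved"), whereas in the conflict-style case the application has to explain, in domain terms, a state in which some of the user's changes were applied and others were not.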

IMO the tradeoff of user-interface model/complexity vs. single/multi-node
deployment vs. transaction cost should be in the hands of the
application developer.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

When I hear somebody sigh, 'Life is hard,' I am always tempted to ask,  
'Compared to what?'
   -- Sydney Harris


