couchdb-dev mailing list archives

From Tim Parkin <i...@timparkin.co.uk>
Subject Re: Restricting user interactions to a single document -- was [VOTE] Apache CouchDB 0.9.0 release
Date Fri, 27 Mar 2009 20:36:05 GMT
Brian Candler wrote:
> On Thu, Mar 26, 2009 at 05:00:22PM +0000, Tim Parkin wrote:
>>> In what way is that not atomicity?
>> Well the difference is I'm more interested in the ability to roll back
>> than in the atomicity.
> 
> But won't you need atomicity to guarantee the ability to rollback?

Yes.. but we're not trying to guarantee consistency, just trying to
prevent inconsistency where possible.

> 
> Consider the following sequence where I want to apply changes to A and B,
> but B fails.
> 
> 1. Update A to A'
> 2. Try to update B to B' but it fails
> 3. Revert A' to A
> 
> Now, what happens if someone else updates A in the middle?
> 
> 1. Update A to A'
> 2. Another user updates A' to A"
> 3. Try to update B to B' but it fails
> 4. Err, what do I do now?
> 
> I can't revert A" to A because that would also undo someone else's changes.
> 
> At best, it could be handled like a replication conflict: both A and A"
> exist in the database simultaneously. However, the person making the update
> in (2) saw it as a successful, non-conflicting update.
> 

Yes..
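
To make the interleaving above concrete, here is a rough sketch of it
against a toy in-memory store with revision-checked writes. ToyStore,
Conflict and run_scenario are hypothetical names invented for this
illustration; none of this is CouchDB's API or implementation, just a
single-node model of "an update must name the current revision".

```python
class Conflict(Exception):
    """Raised when a write names a revision that is no longer current."""


class ToyStore:
    """Hypothetical single-node store: doc id -> (revision number, value)."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, value, expected_rev=0):
        # A write succeeds only if the caller names the current revision
        # (0 means "I expect the document not to exist yet").
        current_rev = self._docs.get(doc_id, (0, None))[0]
        if expected_rev != current_rev:
            raise Conflict(doc_id)
        self._docs[doc_id] = (current_rev + 1, value)
        return current_rev + 1


def run_scenario():
    store = ToyStore()
    rev_a = store.put("A", "A")
    stale_rev_b = store.put("B", "B")
    # Someone else has already moved B on, so our revision of B is stale.
    store.put("B", 'B"', expected_rev=stale_rev_b)

    # 1. Update A to A'
    rev_a_prime = store.put("A", "A'", expected_rev=rev_a)
    # 2. Another user updates A' to A"
    store.put("A", 'A"', expected_rev=rev_a_prime)
    # 3. Try to update B to B' but it fails
    try:
        store.put("B", "B'", expected_rev=stale_rev_b)
        b_failed = False
    except Conflict:
        b_failed = True
    # 4. Try to revert A' to A: the revision we wrote is no longer current
    try:
        store.put("A", "A", expected_rev=rev_a_prime)
        revert_ok = True
    except Conflict:
        revert_ok = False
    return b_failed, revert_ok, store.get("A")
```

In this model the revert in step 4 is refused rather than silently
undoing the other user's change, which is exactly the "what do I do
now?" position: the caller is left with A" in place and B untouched.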

> The next person to read A will (if they ask) see two different versions. A'
> will have vanished if the database has been compacted, making it hard to
> resolve them back into one version.
> 
> You also need to consider what happens when either the database or your
> application crashes in the middle of such a sequence. Unless your
> application maintains a separate transaction log, you would require the
> partial update to be rolled back by the database itself.

At the moment bulk docs are guaranteed atomic if the application crashes
or validation fails (is this right?).. it's only when there is a conflict
that atomicity is no longer available.
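
For reference, a bulk update body for the two-document case might be
built roughly like this. The "all_or_nothing" field name is taken from
the wiki wording quoted further down; the helper name, document ids and
revision strings are made up for illustration, and I'm assuming the
body is POSTed to the database's _bulk_docs resource as JSON.

```python
import json


def bulk_docs_body(docs, all_or_nothing=True):
    """Build a JSON body for a bulk update (hypothetical helper)."""
    body = {"docs": list(docs)}
    if all_or_nothing:
        # Flag name as quoted from the wiki in this thread.
        body["all_or_nothing"] = True
    return json.dumps(body, sort_keys=True)


# Placeholder ids and revisions, not taken from a real database.
body = bulk_docs_body([
    {"_id": "A", "_rev": "1-aaa", "value": "A'"},
    {"_id": "B", "_rev": "1-bbb", "value": "B'"},
])
```

The point is only that both changes travel in one request, so the
server is in a position to accept or reject them together.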

> 
> In order to make sense of this, can I just step back a bit and work out
> exactly under what circumstances you need to roll back the transaction.
> 
> Is your primary concern concurrency failure? That is, you tried to update B
> to B', but someone else had changed B to B" in the meantime?
> 
> That's fine, but remember that concurrency control only works in the context
> of a single node anyway. A PUT will guarantee that my update from B to B' is
> not stomped on by someone else on the same node trying to update B to B".
> However, as soon as you introduce replication into the mix, all bets are
> off; you *will* get multiple conflicting versions in the database anyway.
> 

Absolutely agree.. However, just because we don't have a guarantee of
consistency doesn't mean we shouldn't be looking to increase the
occurrence and probability of consistency, i.e. accidents happen but
that doesn't mean you shouldn't be careful..

> In essence I agree with you though: operations which are only atomic on a
> single node are useful (and that includes PUT concurrency control). Not
> everyone has a cluster or plans to go there.
> 
> I also think the reason given on the wiki for dropping _bulk_docs atomic
> operations doesn't make sense. The old behaviour won't work on a sharded
> cluster without two-phase commit. But as far as I can see, the replacement
> "all_or_nothing" mode won't work on a sharded cluster without two-phase
> commit either.
> 

That is my understanding also.. trying to make a single node look like a
distributed system is not feasible. However, I'm hoping that some
compromise is possible and that the discussion is far from over.

> Of course, what's written on the wiki doesn't necessarily represent the
> views of the authors, who probably have a better reason for including this
> behaviour.
>

I'll be adding to the wiki next week just to increase the entropy :-)

Tim


