couchdb-dev mailing list archives

From Scott Shumaker <sshuma...@gmail.com>
Subject Re: Removing PUT 409?
Date Tue, 07 Apr 2009 18:55:00 GMT
Fair enough - apparently I'd just gotten concerned that CouchDB was
heading in a new direction w.r.t. conflicts - especially after the
latest changes made bulk_docs no longer transactional (which is
already causing issues for us in our staging environment).  I'd really
like an option to have it fail entirely if there are any conflicts.
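
A minimal sketch of the "fail entirely if there are any conflicts"
behaviour, modelled on a plain in-memory dict rather than on CouchDB
itself. The function name, the ConflictError class, the integer
revision counter and the store layout are all assumptions made for
illustration, not part of any CouchDB API:

```python
class ConflictError(Exception):
    """Raised when at least one document in the batch has a stale revision."""


def bulk_update_all_or_nothing(store, docs):
    """Apply every doc in `docs`, or none of them.

    Hypothetical model: `store` maps doc id -> (rev, body), where rev is
    an integer counter. Each doc is a dict with "_id", "_rev" (None for
    a new document) and "body". Assumes each id appears once per batch.
    """
    # Pass 1: validate every revision before touching the store.
    conflicts = []
    for doc in docs:
        current = store.get(doc["_id"])
        current_rev = current[0] if current else None
        if doc["_rev"] != current_rev:
            conflicts.append(doc["_id"])
    if conflicts:
        # Nothing has been written yet, so the batch fails as a whole.
        raise ConflictError(conflicts)

    # Pass 2: no conflicts were found, so apply all updates.
    for doc in docs:
        new_rev = (doc["_rev"] or 0) + 1
        store[doc["_id"]] = (new_rev, doc["body"])
    return {doc["_id"]: store[doc["_id"]][0] for doc in docs}
```

The two passes only give all-or-nothing behaviour when nothing else
can write between them, which is the single-node case.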

Do you think there's any merit to adding a new mode to bulk_docs
to yield the old transactional behavior - except re-implement it using
hidden revisions (as I suggested earlier) to represent the in-progress
state of the transaction?  This would allow the mode to work with
sharding, at the cost of being twice as expensive (requiring two
writes instead of one).
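
A rough sketch of the two-write idea, again on an in-memory model
rather than real CouchDB internals. The Doc class, its `pending` field
and the transactional_bulk function are hypothetical names, and for
brevity the sketch only updates documents that already exist:

```python
class Doc:
    """Hypothetical document record with one visible and one hidden slot."""

    def __init__(self, rev, body):
        self.rev = rev          # visible revision counter
        self.body = body        # visible body
        self.pending = None     # (txn_id, body) while a transaction is open


def transactional_bulk(store, txn_id, updates):
    """Apply `updates` atomically; return True on commit, False on abort.

    `updates` is a list of (doc_id, expected_rev, new_body) tuples.
    """
    staged = []
    # Write 1: stage a hidden revision on every document.
    for doc_id, expected_rev, body in updates:
        doc = store.get(doc_id)
        stale = doc is None or doc.rev != expected_rev
        if stale or doc.pending is not None:
            # Abort: discard the hidden revisions staged so far.
            for staged_doc in staged:
                staged_doc.pending = None
            return False
        doc.pending = (txn_id, body)
        staged.append(doc)

    # Write 2: promote each hidden revision to the visible one.
    for doc in staged:
        doc.rev += 1
        doc.body = doc.pending[1]
        doc.pending = None
    return True
```

Each document is written twice on the success path (stage, then
promote), which is where the doubled cost comes from.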

Scott

On Tue, Apr 7, 2009 at 3:51 AM, Jan Lehnardt <jan@apache.org> wrote:
> Hi Scott,
>
> thanks for raising your concerns. I share them, but I think Brian and
> Adam are only suggesting an optional addition to the existing
> API which leaves the existing case in place. Much like bulk docs
> now has two modes, PUT can have two modes.
>
> Something like
>
>  PUT /db/doc?rev=foo&allow_conflicts=true
>  {"json":"body"}
>
> I wouldn't be opposed to adding this.
>
> Cheers
> Jan
> --
>
>
> On 6 Apr 2009, at 20:40, Scott Shumaker wrote:
>
>> Just my $0.02, but I think CouchDB is moving in entirely the wrong
>> direction with conflicts in a misguided attempt to make multi-master
>> replication the 'only' way to do things.
>>
>> Very frequently, you need to attempt to resolve a conflict as soon as
>> it occurs - and you often need user interaction to help you resolve
>> the conflict.  Sometimes you may need to just refresh the user to the
>> latest version, other times you may be able to choose one of the
>> versions based on some criteria, sometimes you can automatically merge
>> the two versions, and occasionally you need to ask the user what to
>> do.  This just won't work if the process is happening offline, in a
>> background job.
>>
>> This isn't just true of CouchDB, but of other distributed systems like
>> Dynamo (read the paper, they talk about this exact issue.  Amazon.com
>> has a "merge shopping carts" screen for this exact reason).
>>
>> Getting rid of conflict handling greatly limits the utility of CouchDB
>> for real-world applications (it will certainly force us to adopt
>> another technology instead).  And worse, this is all for the goal of
>> supporting multi-master replication, which really isn't a great
>> technology solution anyway.  If you want durability and scalability,
>> CouchDB should really adopt the much more robust multiple write nodes
>> / read nodes system (with quorum and reconciliation) in Dynamo or a
>> few other distributed key/value stores.
>>
>> Scott
>>
>>
>> On Mon, Apr 6, 2009 at 12:40 AM, Brian Candler <B.Candler@pobox.com> wrote:
>>>
>>> The following is part thought-experiment, part serious suggestion.
>>>
>>> I propose the following: remove all concurrency control from PUT
>>> operations, and hence also the 409 response. If you PUT a document
>>> where the _rev is not the same as a 'head' revision, then a new
>>> conflicting version is inserted. [1]
>>>
>>> The reasoning is as follows:
>>>
>>> 1. Any application which relies on the 409 PUT conflict behaviour is
>>>  not going to work properly in a multi-master replication environment.
>>>  That is: it is protected against concurrent changes on the same node,
>>>  but not on a different node. This is arbitrary.
>>>
>>> 2. The same reasoning was used for getting rid of bulk non-conflicting
>>>  updates. Paraphrasing: "a grown-up CouchDB app which runs on a
>>>  replicated cluster won't be able to rely on these semantics, so
>>>  removing this capability will encourage you to write your app in a
>>>  more scalable way. You will thank us later."
>>>
>>> 3. A CouchDB app should be written so that it "treats edit conflicts as a
>>>  common state, not an exceptional one" [2]
>>>
>>>  This change will slightly increase the number of these normal conflicts,
>>>  whilst forcing the app writer to deal with them.
>>>
>>> 4. By increasing the number of conflicting versions, it is likely to
>>>  exercise the underlying code more and flush out bugs (for example,
>>>  more fully testing what happens in views when multiple conflicting
>>>  versions of a document are updated or removed)
>>>
>>> 5. It may highlight more clearly where API improvements are needed to
>>>  help applications deal with and resolve conflicts. For example:
>>>
>>>  - making it easier for applications to be aware of the existence of
>>>    conflicts (Maybe a GET without _rev should fail if there are multiple
>>>    conflicting revs, or return all of the versions)
>>>
>>>  - given that multiple concurrent clients will see conflicts, and may
>>>    attempt to resolve them at the same time, then it's likely that two
>>>    clients will independently submit exactly the same document content
>>>    after running the conflict-resolution algorithm. It could be helpful
>>>    if these were treated as a single new rev, and not two new conflicts.
>>>
>>> Comments? I would be especially interested in hearing from core
>>> developers who didn't want bulk non-conflicting updates, but *do* want
>>> to retain single non-conflicting updates, as to why this is logical.
>>>
>>> Regards,
>>>
>>> Brian.
>>>
>>> [1] You can get this behaviour on 0.9.0 by POSTing to _bulk_docs with
>>> {"all_or_nothing":true}
>>>
>>> [2] http://couchdb.apache.org/docs/overview.html under heading
>>> "Conflicts"
>>>
>>
>
>
