couchdb-user mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: Regarding `_bulk_docs` atomicity [Was: Re: How to find the change sequence of a document's revision]
Date Sun, 23 Dec 2012 10:33:15 GMT
Ok, I've done a bit of code diving and some testing and can confirm
that the wiki page is accurate.

➜  ~  curl -Hcontent-type:application/json \
    localhost:5984/db1/_bulk_docs \
    -d '{"all_or_nothing":true, "docs": [{"_id":"doc1"}, {"_id":"doc2"}, {"_id":"doc3","fail":true}]}'
[{"id":"doc3","rev":"0-","error":"forbidden","reason":"fail"}]
➜  ~  curl localhost:5984/db1/doc1
{"error":"not_found","reason":"missing"}

I have a validate_doc_update that fails any document with a
"fail":true property.

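For illustration, a validation function along the following lines would
reject doc3 with the error shown in the curl output. This is only a
minimal sketch: the function name validateDocUpdate and the design
document id _design/validation are made up for the example, and it
assumes the server accepts the source of a named function in the
validate_doc_update field.

```javascript
// Sketch of a validation function that rejects any document carrying "fail": true.
// CouchDB calls such a function with the new document, the stored one, and the user context.
function validateDocUpdate(newDoc, oldDoc, userCtx) {
  if (newDoc.fail === true) {
    // Throwing an object with a "forbidden" key is reported to the client
    // as {"error":"forbidden","reason":"fail"}.
    throw { forbidden: "fail" };
  }
}

// Hypothetical design document: the function is stored as source text.
const designDoc = {
  _id: "_design/validation", // assumed id, not taken from the message
  validate_doc_update: validateDocUpdate.toString(),
};
```
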
The curl output above confirms that doc1 is not created because doc3
failed validation. So, this is atomic across the group. But note that
this atomicity only applies to validation. If doc3 had not caused the
commit to abort, then doc1 and doc2 would both have been written as
well, even if that introduced a conflict.
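
The two-phase behavior described here can be modeled with a toy
in-memory store. This is a sketch of the semantics as described in this
message, not CouchDB's implementation: bulkAllOrNothing, the Map-based
store, and the "r1", "r2" revision ids are all invented for the example.

```javascript
// Toy model: db maps a document id to the list of leaf revision ids stored for it.
let revCounter = 0; // invented revision id generator, not CouchDB's scheme

function bulkAllOrNothing(db, docs, validate) {
  // Phase 1: validate every document before writing anything.
  const errors = [];
  for (const doc of docs) {
    try {
      validate(doc);
    } catch (e) {
      errors.push({ id: doc._id, error: "forbidden", reason: e.forbidden });
    }
  }
  if (errors.length > 0) return errors; // one failure: nothing is written

  // Phase 2: write everything. A missing or stale _rev does not abort;
  // it just adds another leaf, which is a conflict if one already exists.
  const results = [];
  for (const doc of docs) {
    const leaves = db.get(doc._id) || [];
    const rev = `r${++revCounter}`;
    const parent = leaves.indexOf(doc._rev);
    if (parent >= 0) {
      leaves[parent] = rev; // normal update of an existing leaf
    } else {
      leaves.push(rev); // new document, or a conflicting branch
    }
    db.set(doc._id, leaves);
    results.push({ id: doc._id, rev: rev });
  }
  return results;
}

const validate = (doc) => {
  if (doc.fail === true) throw { forbidden: "fail" };
};
const db = new Map();

// doc3 fails validation, so doc1 and doc2 are not written either.
const rejected = bulkAllOrNothing(
  db,
  [{ _id: "doc1" }, { _id: "doc2" }, { _id: "doc3", fail: true }],
  validate
);
const sizeAfterReject = db.size;

// Create doc1, then write it again without its _rev: no abort, just a second leaf.
bulkAllOrNothing(db, [{ _id: "doc1" }], validate);
bulkAllOrNothing(db, [{ _id: "doc1" }, { _id: "doc2" }], validate);
```

In this model the last call leaves doc1 with two leaves, which is the
conflict case: the group is still written in full.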

So, to take this back to the first post in the thread, all_or_nothing
doesn't provide the semantics you need, because it will introduce
conflicts rather than abort. The 0.9 and earlier behavior did provide
the semantics you need.

BigCouch does not support all_or_nothing at present, though I think it
could, given that it's only a question of validating the contents of
the bulk_docs body; the fact that the documents would then end up on
different machines doesn't seem like an impediment.

That said, is the current all_or_nothing:true behavior valuable to
many people, given these details? I don't know.

B.


On 23 December 2012 10:16, Robert Newson <rnewson@apache.org> wrote:
> Well, there's a core design goal that if your application works on a
> single CouchDB node, it'll work in a 100 node CouchDB cluster without
> modification. This requires us to prevent you shooting yourself in the
> foot. It's why you can only atomically update a single document, etc.
> I owe the wiki a detailed and accurate explanation of the post-0.9
> all_or_nothing:true behavior too (as soon as I relearn what it is!).
>
> I understand the basic idea, though, and there is a desire to better
> modularize the internals. We'd like you to be able to take the core
> code (the database engine itself) and build on top of that. It won't
> be CouchDB, but it will leverage proven code, and you could build even
> the original all-or-nothing semantics with it.
>
> B.
>
>
> On 23 December 2012 10:06, Ciprian Dorin Craciun
> <ciprian.craciun@gmail.com> wrote:
>> On Sun, Dec 23, 2012 at 11:06 AM, Robert Newson <rnewson@apache.org> wrote:
>>> It's been my view for some time that this option should be removed.
>>> BigCouch doesn't support it, so unless some extraordinary effort is
>>> made during the merge, it'll go away at that point.
>>
>>
>>     Couldn't the CouchDB team adopt a slightly different approach to
>> such "items" (choice (B) below):
>>
>>     (A) One direction I feel CouchDB is heading in is the "common
>> denominator" of most possible CouchDB-like implementations. Because
>> some things are hard to implement in a distributed environment (like
>> BigCouch), and others might be hard to implement in an embedded
>> environment (like TouchDB), such options are gradually being
>> "eliminated" from the CouchDB API.
>>
>>     (B) Another option -- one that I would prefer -- is going with
>> something like "capabilities", where for some operations the user
>> explicitly asks which "capabilities" should be enabled or disabled,
>> and if a particular implementation doesn't support them it should
>> fail the request. This would allow some implementations to expose
>> more (or less) functionality than others. (Of course such a
>> "solution" should be applied with care, or it'll generate a myriad of
>> possible semantics.)
>>
>>     Thus, strictly with regard to the atomicity of bulk operations, I
>> would see it like this:
>>     * the `all_or_nothing` option should be kept;
>>     * its default value should be the most generic one -- the one
>> making the fewest promises;
>>     * each CouchDB implementation could choose to disallow certain values.
>>
>>     (This is because in my particular case I want to use only the
>> "standard" single-instance / no-replication CouchDB.)
>>
>>     Ciprian.
