couchdb-commits mailing list archives

From Apache Wiki <>
Subject [Couchdb Wiki] Update of "HTTP_Bulk_Document_API" by RobertNewson
Date Sun, 23 Dec 2012 16:19:45 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "HTTP_Bulk_Document_API" page has been changed by RobertNewson:

clarified 'Transactional Semantics with Bulk Updates', mostly by removing obsolete preamble
about ancient versions.

  === Transactional Semantics with Bulk Updates ===
- In previous releases of CouchDB, bulk updates were transactional - in particular, all requests
in a bulk update failed if any request failed or was in conflict. There were a couple of problems
with this approach:
-  * This doesn't actually work with replication. Replication doesn't provide the same transactional
semantics, so downstream replicas won't see "all-or-nothing" transactional semantics. Instead,
they will see documents in an inconsistent state until replication of all documents involved
in the bulk update completes. With bidirectional replication it can get even worse, because
you can get edit conflicts that must be fixed manually.
+ In short, there are none (by design). However, you can ask CouchDB to check that all the
documents in your {{{_bulk_docs}}} request pass all your validation functions. If even one
fails, none of the documents are written. You can select this mode by including {{{"all_or_nothing":true}}}
in your request. With this mode, if all documents pass validation, then all documents will
be updated, even if that introduces a conflict for the affected documents.
+ Bulk updates work independently of replication: the documents updated in a {{{_bulk_docs}}}
request will not be replicated as a group, and will not even necessarily be replicated in
the same order as they were in the request.
-  * If your database is partitioned (aka "sharded"), different documents within the transaction
could live on different nodes in the cluster, and these kinds of transactional semantics don't
work unless you use heavy, non-scalable approaches like two-phase commit.
- With release 0.9 of CouchDB, bulk update semantics have been changed so that a CouchDB server
behaves consistently in a single-node, replicated, and/or partitioned environment. Note that
this change makes explicit the fact that CouchDB is not a relational store and does not guarantee
relational consistency between documents. As a developer you need to be aware of these semantics
and design your data model and your application with this in mind.
- There are now two bulk update models supported:
-  * '''non-atomic''' - This is the default behavior.  Some documents may successfully be
saved and some may not.  The response will tell the application which documents were saved
or not. In the case of a power failure, when the database restarts some may have been saved
and some not.
-  * '''all-or-nothing''' - To use this mode, include {{{"all_or_nothing":true}}} as part
of the request.  In the case of a validation failure, none of the documents will be saved.
 However, it does not do conflict checking, so all documents will be committed even if this
creates conflicts.
- {{{#!highlight javascript
- {
-   "all_or_nothing": true,
-   "docs": [
-     {"_id": "0", "_rev": "1-62657917", "integer": 10, "string": "10"},
-     {"_id": "1", "_rev": "2-1579510027", "integer": 2, "string": "2"},
-     {"_id": "2", "_rev": "2-3978456339", "integer": 3, "string": "3"}
-   ]
- }
- }}}
- In this case, all three documents will be saved, and the response will show success for
all of them. However if the document with id 0 had a conflict, both versions will be present
in the database, with an arbitrary choice made as to which appears in views. You can check
for this status using a GET with {{{?conflicts=true}}}
- If any update fails validation, all updates will fail.
- All or nothing transactions should not be used to enforce referential integrity, as some
or all updated documents might become losing conflicts during the update. The transaction
should be used to make sure all information is captured in an atomic operation, but conflicts
may need to be addressed later. Applications that rely on this functionality should be able
to tolerate some documents missing or being in a conflicted state until conflict resolution
can occur.
- Bulk updates work independently of replication, meaning document revisions originally saved
as part of an all or nothing transaction will be replicated individually, not as part of a
bulk transaction. This means other replica instances may only have a subset of the transaction,
and if an update is rejected by the remote node during replication (e.g. not authorized error)
the remote node may never have the complete transaction.
- Note that POSTing a single document with {{{"all_or_nothing":true}}} behaves completely
differently from a regular PUT, since it will save conflicting versions rather than rejecting
a conflict.
- {{{
- $ DB=""
- $ curl -X PUT "$DB"
- $ curl -X PUT -d '{"name":"fred"}' "$DB/person"
- $ curl -X POST -H 'Content-Type: application/json' -d '{"all_or_nothing":true,"docs":[{"_id":"person","_rev":"1-877727288","name":"jim"}]}' "$DB/_bulk_docs"
- $ curl -X POST -H 'Content-Type: application/json' -d '{"all_or_nothing":true,"docs":[{"_id":"person","_rev":"1-877727288","name":"trunky"}]}' "$DB/_bulk_docs"
- $ curl "$DB/person?conflicts=true"
- }}}
- Result:
- {{{#!highlight javascript
- {"ok":true}
- {"ok":true,"id":"person","rev":"1-877727288"}
- [{"id":"person","rev":"2-3595405"}]
- [{"id":"person","rev":"2-2835283254"}]
- {"_id":"person","_rev":"2-3595405","name":"jim","_conflicts":["2-2835283254"]}
- }}}
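As a rough illustration of the {{{"all_or_nothing":true}}} mode described in this change, the sketch below builds a {{{_bulk_docs}}} request body and reads the {{{_conflicts}}} field from a document fetched with {{{?conflicts=true}}}. It is not part of the wiki page. The helper names build_bulk_docs_body and conflicting_revisions are hypothetical, the sample values are copied from the example in this message, and it assumes a CouchDB version that still honors the flag. No HTTP request is made.

```python
import json


def build_bulk_docs_body(docs, all_or_nothing=False):
    """Return the JSON body for a POST to a database's _bulk_docs endpoint.

    Hypothetical helper for illustration only.
    """
    body = {"docs": list(docs)}
    if all_or_nothing:
        # Per the page text: if any document fails validation, none are
        # written; if all pass, all are written even when that creates
        # conflicts.
        body["all_or_nothing"] = True
    return json.dumps(body)


def conflicting_revisions(doc):
    """Return the revisions listed under _conflicts, or an empty list.

    Expects a document as returned by GET <doc>?conflicts=true.
    """
    return doc.get("_conflicts", [])


# Request body matching the first POST in the example from this message.
body = build_bulk_docs_body(
    [{"_id": "person", "_rev": "1-877727288", "name": "jim"}],
    all_or_nothing=True,
)

# Response document copied from the "Result" block in this message.
fetched = json.loads(
    '{"_id":"person","_rev":"2-3595405","name":"jim",'
    '"_conflicts":["2-2835283254"]}'
)

print(body)
print(conflicting_revisions(fetched))
```

Because this mode saves conflicting revisions rather than rejecting them, an application that uses it would likely need a check along the lines of conflicting_revisions after each write to notice conflicts that need resolving.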
  === Posting Existing Revisions ===
